I'll just confirm what others said on this thread -- this is not a security issue.
Zak -- there seems to be an escaping/encoding problem in the UI though -- we don't want user input to "break" or "manipulate" UI-generated CQL queries. I think the UI should escape all special CQL chars when constructing CQL queries, including double-quotes, to prevent this.
Marc and Tod -- as far as potential backend vulnerabilities go, I don't think we should be using the term "CQL injection" as there's no such thing. CQL is exposed in the public APIs (it's part of the API). The specific concerns Tod listed (2 and 3) are worth investigating though and they would apply to API design in general, they are not specific to using CQL. We probably need to dig deeper into those but here are my initial thoughts (again this is not related to the problem reported by Kevin):
1. reading sensitive data that the user does not have no access to
In FOLIO we don't use DB row-level nor field-level security. Platform authorization (permissions) is enforced per API endpoint -- either the user has access to one or not. FOLIO has a "hook" for implementing field- or row-level security with the so-called "desired permissions" -- not enforced by the platform but passed down to the module -- but it's left to the module to implement and enforce. What this means in practice is that if you wish to have certain fields more "sensitive" than others (e.g password field in the user record) and have it accessible only to a subset of users you have two options:
a) implement access control in the module --- e.g add a new "desired permission" called "user.item.get.sensitive". This permission will not be enforced in the gateway but it will be passed down to the module in the headers. The module must then verify the permission and strip/include sensitive data accordingly. Afaik, no module in FOLIO actually does this (mod-login being an exception)
b) structure the API so that the sensitive information is accessible under a separate endpoint -- e.g assuming user is retrieved through GET /user, the sensitive portion would be in GET /user/sensitive. It would need it's own JSON schema (there must be one for each endpoint) either for the entire record or just the sensitive part. In this case the permission to read sensitive data is a "required permission" and it's enforced by the gateway
That's the basics and in the case when a module implements its own access control -- it's hard to reason about what may happen (anything?).
In the platform-enforced access control case, there is a risk of "leaking" information from other "endpoints" through a bad module implementation -- e.g when processing a request the module may read something sensitive from it's DB and append to the response. I guess the only way, I can imagine, we could truly prevent bad implementations is by granulary enforcing data access in the DB (e.g a separate ROLE for each table) and mapping that up to the endpoint -- e.g request to access /user has only privileges to access the USER table. This would be an extremely complex and elaborate DB ROLE scheme though and I've never seen that used in practice. I guess it's possible in theory though. We would still need to figure out who does the ROLE "switching", we obviously can't leave it up to the module -- if we assume the module may include "bad implementation" it could also break switching.
So that's in general, fortunately in FOLIO most modules use a similar approach for API impl and DB access (through RMB) and it's a bit easier to reason what happens there.
For the sake of argument, assume the module exposes two endpoints /instances and /items, items being "linked" to the instances through a foreing-key (instanceId), each with individual access permissions (instances.list.get and items.list.get).
Tod already noted that CQL is used only for filtering down result sets, it is not used for data manipulation (though it can be combined with data manipulation functions as a filtering mechanism to narrow down what needs to be manipulated). I'll add that CQL is also NOT used to control what information is retrieved back to the user -- it only controls if a bit of information is included in the result or not. In SQL terms, CQL only controls what goes in after the WHERE clause, it does not control what is specified right after the SELECT clause, in short: SELECT something FROM somewhere WHERE <criteria based on CQL>. So with CQL per se it's not possible to actually READ any information that were not explicitly made available in the API endpoint. In the above example -- you can't use CQL to read items through the /instances endpoint. For that you would need to have a new endpoint with a separate JSON Schema and with a new permission called e.g /instances-with-items. Since CQL only controls what goes after the WHERE clause it cannot manipulate the retrieval portion of your SELECT and include anything beyond what the endpoint (which usually directly maps to a table or a view) specifies. (Side note: this is also the major difference between CQL and GraphQL and also a major source of frustration by FOLIO developers, I wonder how does GraphQL deal with this "problem"?)
What CQL can do, however, is to filter/match on properties from "related" entities, if and only if, such link has been established in the so-called "schema.json" configuration file (I'm assuming RMB here, ERM uses a different CQL impl). Because of this, I guess, one could find out if a piece of information, he or she does not have direct access to, is present in the system or not, though still not read it. E.g I could find out if there is an item with a barcode "xyz" without having access to the /items endpoint but only having access to the /instances endpoint by querying it with "item.barcode='xyz'". I can't quite tell if this is a vulnerability or not. I guess there might be examples where this undesirable. I should stress though that for this search to work, the link must be explicit in "schema.json" -- and it sounds like we should avoid casually linking sensitive information this way anyway.
2. slowing or taking a module down
This I think is a practical threat, though not specific to CQL -- we currently have such "vulnerabilities" present in the system through the unbounded "limit" parameter, if the "limit" is set too high attacker may shut the module down with OOM. Platform team is looking for a solution to the "limit" issue under
https://issues.folio.org/browse/RMB-535. In the CQL case, a simple "fix" could be limiting the size of the query (long queries may take more time/resources) and, which Marc already mentioned, rejecting queries that are not backed up by an explicit DB index -- currently we only flag such queries with WARN messages but pass them through as there are many "small" modules where lack of indexing still results in acceptable performance.