I question whether this is really an "injection" case at all.

From what I read in the description, there is an issue with the CQL parser that caused it to misinterpret a bad request and expand it too far. However, it all remains within the same query. It does not break out of the original query and then allow the insertion of an arbitrary secondary query. Arguably, the scope of the initial request has been expanded, but that came from intended CQL functionality; it wasn't controlled by malicious terms in the query.

We see:
  1. that in the case of the "normal" query the CQL parser will accept a non-fielded search term and expand it as a data value against a preset number of fields. This is expected CQL behaviour.
  2. that in the case of the "exploit" malformed query a couple of additional things happen.
    1. the leading spaces in front of the OR operator seem to be interpreted as another non-fielded data value (in this case the empty string value) and expanded as before.
    2. the second term is then appended as an OR operation.

In the end the issue is poor parsing from the CQL parser. It should simply have rejected the query altogether, but instead tried to guess and accommodate. There are clearly several things wrong with that attempt.
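
To make the two cases above concrete, here is a minimal Python sketch. It is not the actual CQL parser: the field list, the function names, and the naive splitting on " OR " are all assumptions made up for illustration. It only shows the difference between guessing at a missing operand and rejecting the query.

```python
# Hypothetical sketch, NOT the real CQL parser.
# DEFAULT_FIELDS stands in for the "preset number of fields" (assumed names).
DEFAULT_FIELDS = ["title", "author"]


def expand_term(term):
    """Expand a non-fielded term into a comparison against each default field."""
    return " OR ".join(f'{field} = "{term}"' for field in DEFAULT_FIELDS)


def lenient_parse(query):
    """Mimic the guessing behaviour: a missing left operand becomes an empty term."""
    left, sep, right = query.partition(" OR ")
    if not sep:
        return expand_term(query.strip())
    # The blank text before OR is treated as an empty-string term and expanded.
    return f"({expand_term(left.strip())}) OR ({right.strip()})"


def strict_parse(query):
    """Reject a boolean operator that is missing an operand instead of guessing."""
    left, sep, right = query.partition(" OR ")
    if sep and (not left.strip() or not right.strip()):
        raise ValueError("boolean operator is missing an operand")
    return lenient_parse(query)
```

With these assumed definitions, `lenient_parse('  OR id = "*"')` produces a query that compares every default field against the empty string and then ORs in the second term, whereas `strict_parse` raises `ValueError` for the same input.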

That other thing about allowing CQL in bulk delete operations sounds very dangerous, and we should look closely at that. But that's not even an injection risk either. It would seem like an open invitation to submit a naughty CQL query directly. <= Flag that one for the security audit.

The very notion of adopting the relatively obscure CQL with a homegrown parser is problematic.

(BTW adding Craig and Mark directly on CC, since EBSCO's email protections consider this very email distribution list to be a security vulnerability and they would not otherwise see this response.)
_
V

On 2020-01-29 15:40, Marc Johnson wrote:

Hi Tod, I’ll try to answer your questions.

On Wed, 29 Jan 2020 at 20:18, Tod Olson <tod@uchicago.edu> wrote:
I'm trying to think quickly about malicious ways to use CQL injection:

(1) As I understand, CQL is query-only, there is no way to make a CQL query that causes an update to the system. Is that true? No sneaky way to slip in an update? If that is true, good, one concern is addressed.

This is mostly true. 

Increasingly folks have asked for bulk endpoints that use CQL for record selection, either to delete or in unison with some form of transformation.

I believe delete using CQL has been implemented in a limited number of places.


(2) What about reading sensitive information? Could one use injection to retrieve information that should be protected, like user addresses or purchase prices (which are sometimes confidential by contract, especially for large packages of electronic resources)? The injection examples make it seem like one could request any data in a module where one has read privileges. This is probably my main concern.


For the most part, the collection endpoints where CQL can be provided use a coarse grained permission that either allows or disallows access.

Only tokens for users or other modules that have that permission should be able to make requests to that endpoint.

However once they do, they can receive any records in that collection, which might include additional information that they wouldn’t normally have access to.

This last part is fairly standard. For example, loans provided by the business logic API have a small amount of item information included with them, even if the user does not have direct access to items.

It might be that some modules have introduced finer grained permissions that can affect search results or what is included in the records.

I don’t know of an example of that off the top of my head. The only core modules that I think might have done that are in acquisitions.


(3) Could CQL injection be crafted simply to consume inordinate resources, slow the system, or take a module down? 

I think it is possible to construct CQL that could use lots of resources. I’m not aware of specific examples or exploitations. 

At present it would likely take the form of using a CQL index not supported by a PostgreSQL index, or a combination of CQL indexes that generates a very expensive SQL query.

I believe the Core Platform team may already have work outlined that could reduce or mitigate some of this. For example, only allowing CQL queries to use indexes that have a corresponding PostgreSQL index defined.
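
As an illustration only, such a restriction could be sketched in Python as an allow-list check. The index names and the function below are hypothetical, and this is not a description of the Core Platform team's actual design; it assumes the set of CQL indexes backed by a PostgreSQL index is known before the query runs.

```python
# Hypothetical allow-list: CQL indexes assumed to have a PostgreSQL index behind them.
INDEXED_CQL_FIELDS = {"title", "barcode"}


def check_indexes(cql_fields):
    """Raise ValueError if the query uses a CQL index with no database index."""
    unsupported = sorted(set(cql_fields) - INDEXED_CQL_FIELDS)
    if unsupported:
        raise ValueError(
            "no database index for CQL index(es): " + ", ".join(unsupported)
        )
```

A query touching only `title` would pass, while one that also touches an unindexed field such as a hypothetical `notes` index would be refused before any SQL is generated.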

Theoretically, there might be a variant of this where specially crafted CQL could generate injected SQL, in the sense that it exploits the transformation process from CQL to SQL to generate SQL that shouldn’t be possible. This is pure speculation on my part; I’m not aware of any examples of this. It may well be that the CQL syntax, parsing, or translation-to-SQL implementation already prevents this.
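
For reference, the usual general defence against injected SQL is to pass values as bound parameters rather than concatenating them into the SQL text. The sketch below is an assumption-laden illustration, not how the CQL-to-SQL translation is actually implemented: it uses Python's built-in sqlite3 in place of PostgreSQL, and the `items` table and `search_by_title` function are hypothetical.

```python
import sqlite3


def search_by_title(conn, term):
    """Look up items by title, passing the term as a bound parameter."""
    # The term is never spliced into the SQL string, so quotes or operators
    # inside it are compared as plain data rather than parsed as SQL.
    return conn.execute(
        "SELECT id, title FROM items WHERE title = ?", (term,)
    ).fetchall()


# Hypothetical in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO items (title) VALUES (?)", [("Dune",), ("Emma",)])
```

Here a term such as `x' OR '1'='1` matches no rows, because it is treated as a literal title rather than as part of the WHERE clause.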

Jakub may be able to expand upon this area better than I can.


I do think we want to close that injection hole, but for prioritizing I'm trying to focus on potential malicious uses. (2) above makes me a bit nervous for reasons of privacy and contractual obligation.

-Tod

On Jan 29, 2020, at 11:34 AM, Mike Gorrell <mdg@indexdata.com> wrote:

This was raised at the end of today's TC call - a Texas A&M dev has identified an issue that they thought should be raised to the Security Group - which doesn't exist yet.

The Tech Council is the closest thing we have to that security group at this time. I would like to ask this group to weigh in on this issue:


And comment on the issue as a potential security concern as well as how urgently we might want to address it.

Please correspond in email. Feel free to invite others who aren't officially on the TC to be part of this email thread.

Thanks.

-mdg

To unsubscribe from this list please go to http://archives.simplelists.com

