---
title: "Building a Stripe-like Search Language from Scratch"
description: "Learn how to build a custom query language & parser using Peggy.js and Github Copilot. See how to securely let users query using Zuplo's programmable API Gateway"
canonicalUrl: "https://zuplo.com/blog/2024/04/27/building-a-stripe-like-search-language-parser"
pageType: "blog"
date: "2024-04-27"
authors: "nate"
tags: "Tutorial, API Best Practices"
image: "https://zuplo.com/og?text=Building%20a%20Stripe-like%20Search%20Language%20from%20Scratch"
---
At Zuplo, we're enthusiastic advocates of the Stripe API. Its consistent and
straightforward design significantly enriches the developer experience. At
Zuplo, we focus on API design since we create tools that aid developers in
deploying exceptional APIs. While developing our REST APIs, we often begin by
considering how Stripe would approach a feature. Although we do not always adopt
their methods, there are instances where emulating a well-implemented strategy
is more efficient than creating a new one from scratch.

Recently, we have been refining our internal APIs to streamline our codebase and
minimize duplication. In the past, we enabled filtering for various list
requests, but our implementation lacked consistency and repeatability.
Initially, we simply added query parameters to API requests, like this:

```txt
GET /accounts/:accountId/users?role=admin
```

This really isn’t a terrible practice, but over time we started to add more of
these filters and we ended up with some pretty ugly APIs that started to have
lots of query parameters.

```txt
GET /accounts/:accoundId/users?role=admin&role=editor&created-after=2023-01-01&created-before=2024-01-01
```

From the perspective of API consumers, this isn't particularly problematic if
the parameters remain consistent. However, for API maintainers, it imposes a
significant burden. Each new query variant demands custom coding to manage the
query strings and update the database queries accordingly. This becomes
increasingly challenging to maintain and test across a large codebase with
numerous endpoints.

To address this, we decided to implement a stripe like search query language and
parser to get a simple and consistent query language across our APIs. We drew
inspiration from the query languages used by
[Github](https://docs.github.com/en/issues/tracking-your-work-with-issues/filtering-and-searching-issues-and-pull-requests#using-search-to-filter-issues-and-pull-requests)
and [Stripe](https://docs.stripe.com/search#search-query-language), both of
which are straightforward yet robust enough to significantly reduce our reliance
on multiple filter query parameters.

:::info

This is probably the point where some people are thinking, why not just use
GraphQL. The short answer is that we prefer REST for most of our APIs. The
[long answer can be read on this post](/learning-center/graphql-vs-rest-the-right-api-design-for-your-audience).

:::

## Building a Parser

We chose to develop a subset of the Stripe Search Language. The first step was
to explore existing open-source libraries that could meet our needs in
Typescript and Go. However, finding none that were suitable, we decided to
create our own.

There are two main approaches to building a parser: manually or using a
generator that produces a parser from a specified language grammar. We opted for
the latter due to its ease of development and the potential for reuse in other
programming languages, like Go.

The language fundamentals are simple and outlined below. A query clause consists
of a field, an operator, and a value:

<table>
  <tr>
    <th>
      <strong>clause</strong>
    </th>
    <td>
      <code>email:"amy@rocketrides.io"</code>
    </td>
  </tr>
  <tr>
    <th>
      <strong>field</strong>
    </th>
    <td>
      <code>email</code>
    </td>
  </tr>
  <tr>
    <th>
      <strong>operator</strong>
    </th>
    <td>
      <code>:</code>
    </td>
  </tr>
  <tr>
    <th>
      <strong>value</strong>
    </th>
    <td>
      <code>amy@rocketrides.io</code>
    </td>
  </tr>
</table>

Negations are handled by prefixing a field-value pair with a `-`, and logical
operators like AND and OR can be used to combine conditions. Basic comparison
operators such as `<`, `>`, `>=`, and `<=` are also supported.

The complete specification is available here:
[Stripe Search Query Language](https://docs.stripe.com/search#search-query-language)

## Writing the Grammar

After deciding on the language specification, we proceeded to define and
generate the grammar. We settled on using
[Peggy.js](https://github.com/peggyjs/peggy), which simplifies the generation of
a parser from a grammar. To initiate, I utilized GitHub Copilot to draft the
initial code.

![Copilot Query](/media/posts/2024-04-27-building-a-stripe-like-search-language-parser/image.png)

After a few iterations with Copilot I had a simple grammar.

```txt
{
  function makeTerm(field, value) {
    return {field: field.join(''), value: value.join('')};
  }
}

Search
  = ws* terms:TermList ws* { return terms; }

TermList
  = head:Term tail:(_ "AND" _ Term)* {
    return [head].concat(tail.map(function(t) { return t[3]; }));
  }

Term
  = ws* field:Field ws* ":" ws* value:Value ws* { return makeTerm(field, value); }

Field
  = chars:[a-zA-Z]+ { return chars; }

Value
  = chars:[a-zA-Z0-9]+ { return chars; }

_ "whitespace"
  = [ \t\n\r]*

ws "whitespace"
  = [ \t\n\r]+
```

Next, I generated the parser using Peggy.js.

```bash
peggy -o parser.mjs --format es grammar.peggy
```

This output the `parser.mjs` file containing the parser. I can then use this to
parse some queries.

```ts
import * as parser from "./parser.mjs";

const result = parser.parse(`account:foo AND project:baz`);
console.dir(result, { depth: null });
```

The parsed output of that query is shown below.

```json
[
  { "field": "account", "value": "foo" },
  { "field": "project", "value": "baz" }
]
```

### Building the Database Query

I won’t go into all the details of how we build the database query from this
parsed output (maybe in another post), but you can imagine how this would be
done using something like Prisma for example.

```ts
import { PrismaClient } from "@prisma/client";
import * as parser from "./parser.mjs";

const prisma = new PrismaClient();

const result = parser.parse(`name:Alice AND email:alice@prisma.io`);
const user = await prisma.account.findFirst({
  where: {
    name: result.name?.value,
    email: result.email?.value,
  },
});
console.log(user);
```

### Safely Expose your Query to Users

When handling untrusted queries, it's crucial to ensure that only permitted
inputs are allowed, both in terms of key/value pairs and query types. To
safeguard our API, we implemented an API Gateway using
[Zuplo](https://zuplo.com) to filter queries based on predefined rules.

In our Zuplo API Gateway setup, we created a route `/users` that accepts a
`search` query parameter. We wrote a custom inbound policy that uses our parser,
and verifies the fields against a list of permitted ones. If a query includes
disallowed fields, the API returns an error.

This method addresses some security concerns related to dynamic queries but does
not cover all potential issues. When implementing this yourself, be sure to test
a variety of scenarios.

```ts
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";
import { parse } from "./parser";

const allowedFields = ["name", "email", "createdOn"];

export default async function policy(
  request: ZuploRequest,
  context: ZuploContext,
) {
  const search = request.query.search;
  const query = parse(search);

  context.log.info(`Initial query`, query);

  const notAllowedFields = query.filter(
    (q) => !allowedFields.includes(q.field),
  );

  if (notAllowedFields.length > 0) {
    context.log.warn(`Fields not allowed`, notAllowedFields);
    return new Response("Query not allowed", { status: 400 });
  }

  return request;
}
```

When I try to call the API with a query that contains a field that is not on the
list of allowed fields, the API returns an error.

![Bad query](/media/posts/2024-04-27-building-a-stripe-like-search-language-parser/image-1.png)

## Conclusion

In this article, I showed you how to build a custom query syntax and parser
using modern tools like Peggy.js, GitHub Copilot, and Zuplo, enhancing both the
developer and user experience by simplifying API queries. Hopefully you find
this helpful in your own projects.