Search Documents

APIs to search documents in a Collection in OramaCore.

APIs

API Key type: read_api_key. Safe to expose to the public.

To search for documents in a collection, you can use the following API:

curl -X POST \
  'http://localhost:8080/v1/collections/{COLLECTION_ID}/search?api-key=<read_api_key>' \
  -H 'Content-Type: application/json' \
  -d '{ "term": "The quick brown fox" }'

The API will return a list of documents that match the search term. The documents will be sorted by relevance, with the most relevant documents appearing first.
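The same request can also be sent from JavaScript. The following is a minimal sketch, assuming a runtime with a global fetch (Node.js 18+ or a browser). The helper names buildSearchRequest and searchCollection are illustrative and not part of any OramaCore SDK; the host, collection ID, and key are placeholders.

```javascript
// Build the URL and fetch options for a search request.
// The endpoint path and the api-key query parameter mirror the curl example above.
function buildSearchRequest(baseUrl, collectionId, readApiKey, params) {
  const url = new URL(
    `/v1/collections/${encodeURIComponent(collectionId)}/search`,
    baseUrl
  );
  url.searchParams.set('api-key', readApiKey);
  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(params),
    },
  };
}

// Send the request and return the parsed JSON response.
async function searchCollection(baseUrl, collectionId, readApiKey, params) {
  const { url, options } = buildSearchRequest(baseUrl, collectionId, readApiKey, params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Search failed with HTTP status ${response.status}`);
  }
  return response.json();
}
```

Calling searchCollection('http://localhost:8080', collectionId, readApiKey, { term: 'The quick brown fox' }) would then send the same body as the curl command.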

Search Parameters

When performing a search, you can use the following parameters to customize the results:

| Parameter | Description | Default |
|---|---|---|
| term | The search term. | - |
| mode | The search mode. Can be fulltext, vector, or hybrid. | fulltext |
| limit | The maximum number of documents to return. | 10 |
| offset | The number of documents to skip. | 0 |
| properties | The properties to search in. Should be an array of strings (for example: ["title", "description", "author.name"]). | All properties |
| facets | A list of facets to return. See the Facets section. | - |
| where | A filter to apply to the search results. See the Where Filters section. | - |
| threshold | The percentage of matches required to return a document. See the section on the threshold property. | 0 |
| exact | Whether to use exact matching. | false |
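As an illustration of how limit and offset combine for pagination, here is a small sketch. The pageParams helper is hypothetical (it is not an OramaCore function), and the chosen term and properties are example values only.

```javascript
// Translate a 1-based page number into the limit/offset pair described in the table above.
function pageParams(page, pageSize = 10) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be an integer >= 1');
  }
  return { limit: pageSize, offset: (page - 1) * pageSize };
}

// Request body for the third page of results, searching only two properties.
const body = {
  term: 'The quick brown fox',
  mode: 'fulltext',
  properties: ['title', 'description'],
  ...pageParams(3),
};
// body.limit is 10 and body.offset is 20
```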

Where Filters

At index time, OramaCore will index different datatypes in different ways. For example, a string will be indexed differently than a number or a boolean.

When performing a search, you can use the where parameter to filter the search results. The filter syntax depends on the datatype of the property being filtered.

Filtering Strings

OramaCore does not support filtering strings with more than 25 ASCII characters.

To filter strings, pass the value to match for the property in the where parameter:

{
  "term": "John Doe",
  "where": {
    "job": "Software Engineer"
  }
}

The equivalent call in JavaScript:

const results = await collection.search({
  term: 'John Doe',
  where: {
    job: 'Software Engineer'
  }
})

Filtering Numbers

To filter numbers, you can use the following operators:

| Operator | Description | Example |
|---|---|---|
| eq | Equal to | {"where": {"age": {"eq": 25}}} |
| lt | Less than | {"where": {"age": {"lt": 25}}} |
| lte | Less than or equal to | {"where": {"age": {"lte": 25}}} |
| gt | Greater than | {"where": {"age": {"gt": 25}}} |
| gte | Greater than or equal to | {"where": {"age": {"gte": 25}}} |
| between | Between two values | {"where": {"age": {"between": [20, 30]}}} |

So a full query complete with a where filter might look like this:

{
  "term": "John Doe",
  "where": {
    "age": {
      "gte": 25
    }
  }
}

The equivalent call in JavaScript:

const results = await collection.search({
  term: 'John Doe',
  where: {
    age: {
      gte: 25
    }
  }
})
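
To make the operator semantics concrete, the sketch below evaluates the same filter objects against plain numbers on the client. It is a local illustration only, not OramaCore's implementation; the names numericOperators and matchesNumberFilter are hypothetical, and treating between as inclusive of both bounds is an assumption that this page does not confirm.

```javascript
// Local illustration of the numeric operators; not OramaCore's implementation.
// Assumption: "between" includes both bounds.
const numericOperators = {
  eq: (value, target) => value === target,
  lt: (value, target) => value < target,
  lte: (value, target) => value <= target,
  gt: (value, target) => value > target,
  gte: (value, target) => value >= target,
  between: (value, [min, max]) => value >= min && value <= max,
};

// Check a single value against a filter object such as { gte: 25 }.
function matchesNumberFilter(value, filter) {
  return Object.entries(filter).every(([operator, operand]) => {
    const test = numericOperators[operator];
    if (!test) throw new Error(`Unknown operator: ${operator}`);
    return test(value, operand);
  });
}

const ages = [18, 25, 30, 42];
const adults = ages.filter((age) => matchesNumberFilter(age, { gte: 25 }));
// adults is [25, 30, 42]
```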

Filtering Booleans

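To filter booleans, pass true or false as the value of the property:

| Value | Description | Example |
|---|---|---|
| true | Matches documents where the property is true | {"where": {"is_active": true}} |
| false | Matches documents where the property is false | {"where": {"is_active": false}} |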

So a full query complete with a where filter might look like this:

{
  "term": "John Doe",
  "where": {
    "is_active": true
  }
}

The equivalent call in JavaScript:

const results = await collection.search({
  term: 'John Doe',
  where: {
    is_active: true
  }
})

Facets

OramaCore supports faceted search. You can use the facets parameter to get a list of facets for a given property.

Numeric Facets

The facets parameter can be used to get numeric facets. For example, to get a histogram of the price property, you can use the following query:

{
  "term": "Bluetooth Airbuds",
  "facets": {
    "price": {
      "ranges": [
        { "from": 0, "to": 50 },
        { "from": 50, "to": 100 },
        { "from": 100, "to": 200 },
        { "from": 200, "to": 500 },
        { "from": 500, "to": 1000 },
        { "from": 1000 }
      ]
    }
  }
}

The equivalent call in JavaScript:

const results = await collection.search({
  term: 'Bluetooth Airbuds',
  facets: {
    price: {
      ranges: [
        { from: 0, to: 50 },
        { from: 50, to: 100 },
        { from: 100, to: 200 },
        { from: 200, to: 500 },
        { from: 500, to: 1000 },
        { from: 1000 }
      ]
    }
  }
})
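
The sketch below shows, on a local array, what counting values per range means. It is an illustration only, not OramaCore's implementation or its response format. The countByRanges helper is hypothetical, and the boundary rule (a range includes from and excludes to) is an assumption, since this page does not specify how boundary values are assigned.

```javascript
// Local illustration of range bucketing; not OramaCore's implementation.
// Assumption: a range includes `from` and excludes `to`; a missing `to` means no upper bound.
function countByRanges(values, ranges) {
  return ranges.map(({ from = -Infinity, to = Infinity }) => ({
    from,
    to,
    count: values.filter((v) => v >= from && v < to).length,
  }));
}

const prices = [19.99, 49, 75, 120, 349, 1299];
const ranges = [
  { from: 0, to: 50 },
  { from: 50, to: 100 },
  { from: 100, to: 200 },
  { from: 200, to: 500 },
  { from: 500, to: 1000 },
  { from: 1000 },
];
const histogram = countByRanges(prices, ranges);
// counts per range, in order: 2, 1, 1, 1, 0, 1
```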

Boolean Facets

The facets parameter can also be used to get boolean facets. For example, to get facets for the true and false values of the available property, you can use the following query:

{
  "term": "Bluetooth Airbuds",
  "facets": {
    "available": {
      "true": true,
      "false": false
    }
  }
}

The equivalent call in JavaScript:

const results = await collection.search({
  term: 'Bluetooth Airbuds',
  facets: {
    available: {
      true: true,
      false: false
    }
  }
})

Understanding the Orama Threshold Property

The threshold property in OramaCore controls how many of the search terms a document must match to be included in the results of a search operation. It helps filter out potentially irrelevant results, especially with long search queries.

Example Data

Let's consider these four documents:

[
  { "title": "Blue t-shirt, slim fit" },
  { "title": "Blue t-shirt, regular fit" },
  { "title": "Red t-shirt, slim fit" },
  { "title": "Red t-shirt, oversize fit" }
]

Search Behavior Without Threshold

If we search for regular fit:

{
  "term": "regular fit"
}

OramaCore will return:

{
  "count": 4, // 4 results!
  "hits": [...],
  "elapsed": {...}
}

Why four results? While only one document contains the exact phrase "regular fit", OramaCore returns all documents that match any of the search terms. In this case, all documents contain the word "fit", so they're all included in the results.

How Threshold Works

The threshold property is a number between 0 and 1 representing the fraction of search terms a document must match to be included in the results:

  • threshold: 0 (default) - Returns all documents matching ANY search term
  • threshold: 1 - Returns only documents matching ALL search terms
  • threshold: 0.5 - Returns documents with at least 50% of search terms

Examples

With threshold: 0 (default)

{
  "term": "slim fit"
}

Returns all documents containing either "slim" OR "fit" (all 4 documents in our example).

With threshold: 1

{
  "term": "slim fit",
  "threshold": 1
}

Returns only documents containing BOTH "slim" AND "fit" (only the 2 documents with "slim fit").

With threshold: 0.5

{
  "term": "slim fit",
  "threshold": 0.5
}

Prioritizes documents containing both "slim" and "fit", then returns 50% of documents containing either term.

Real-World Application

For large document collections (e.g., 1 million documents), using an appropriate threshold becomes crucial. Long search queries like "red t-shirt with long sleeves and a motorbike printed on the front" could match too many irrelevant documents without a proper threshold setting.
