App Search APIsedit
On this page
Initializing the Clientedit
The AppSearch
client can either be configured directly:
# Use the AppSearch client directly: from elastic_enterprise_search import AppSearch app_search = AppSearch( "http://localhost:3002", http_auth="private-..." ) # Now call API methods app_search.search(...)
…or can be used via a configured EnterpriseSearch.app_search
instance:
from elastic_enterprise_search import EnterpriseSearch ent_search = EnterpriseSearch("http://localhost:3002") # Configure authentication of the AppSearch instance ent_search.app_search.http_auth = "private-..." # Now call API methods ent_search.app_search.search(...)
API Key Privilegesedit
Using the APIs require a key with read
, write
or search
access
depending on the action. If you’re receiving an UnauthorizedError
make sure the key you’re using in http_auth
has the proper privileges.
Engine APIsedit
Engines index documents and perform search functions. To use App Search you must first create an Engine.
Create Engineedit
Let’s create an Engine named national-parks
and uses English
as a language:
# Request: app_search.create_engine( engine_name="national-parks", language="en", ) # Response: { "name": "national-parks", "type": "default", "language": "en" }
Get Engineedit
Once we’ve created an Engine we can look at it:
# Request: app_search.get_engine(engine_name="national-parks") # Response: { "document_count": 0, "language": "en", "name": "national-parks", "type": "default" }
List Enginesedit
We can see all our Engines in the App Search instance:
# Request: app_search.list_engines() # Response: { "meta": { "page": { "current": 1, "size": 25, "total_pages": 1, "total_results": 1 } }, "results": [ { "document_count": 0, "language": "en", "name": "national-parks", "type": "default" } ] }
Delete Engineedit
If we want to delete the Engine and all documents
inside you can use the delete_engine()
method:
# Request: app_search.delete_engine(engine_name="national-parks") # Response: { "deleted": True }
Document APIsedit
Create and index Documentsedit
Once you’ve created an Engine you can start adding documents
with the index_documents()
method:
# Request: app_search.index_documents( engine_name="national-parks", documents=[{ "id": "park_rocky-mountain", "title": "Rocky Mountain", "nps_link": "https://www.nps.gov/romo/index.htm", "states": [ "Colorado" ], "visitors": 4517585, "world_heritage_site": False, "location": "40.4,-105.58", "acres": 265795.2, "date_established": "1915-01-26T06:00:00Z" }, { "id": "park_saguaro", "title": "Saguaro", "nps_link": "https://www.nps.gov/sagu/index.htm", "states": [ "Arizona" ], "visitors": 820426, "world_heritage_site": False, "location": "32.25,-110.5", "acres": 91715.72, "date_established": "1994-10-14T05:00:00Z" }] ) # Response: [ { "errors": [], "id": "park_rocky-mountain" }, { "errors": [], "id": "park_saguaro" } ]
List Documentsedit
Both of our new documents indexed without errors.
Now we can look at our indexed documents in the engine:
# Request: app_search.list_documents(engine_name="national-parks") # Response: { "meta": { "page": { "current": 1, "size": 100, "total_pages": 1, "total_results": 2 } }, "results": [ { "acres": "91715.72", "date_established": "1994-10-14T05:00:00Z", "id": "park_saguaro", "location": "32.25,-110.5", "nps_link": "https://www.nps.gov/sagu/index.htm", "states": [ "Arizona" ], "title": "Saguaro", "visitors": "820426", "world_heritage_site": "false" }, { "acres": "265795.2", "date_established": "1915-01-26T06:00:00Z", "id": "park_rocky-mountain", "location": "40.4,-105.58", "nps_link": "https://www.nps.gov/romo/index.htm", "states": [ "Colorado" ], "title": "Rocky Mountain", "visitors": "4517585", "world_heritage_site": "false" } ] }
Get Documents by IDedit
You can also retrieve a set of documents by their id
with
the get_documents()
method:
# Request: app_search.get_documents( engine_name="national-parks", document_ids=["park_rocky-mountain"] ) # Response: [ { "acres": "265795.2", "date_established": "1915-01-26T06:00:00Z", "id": "park_rocky-mountain", "location": "40.4,-105.58", "nps_link": "https://www.nps.gov/romo/index.htm", "states": [ "Colorado" ], "title": "Rocky Mountain", "visitors": "4517585", "world_heritage_site": "false" } ]
Update existing Documentsedit
You can update documents with the put_documents()
method:
# Request: resp = app_search.put_documents( engine_name="national-parks", documents=[{ "id": "park_rocky-mountain", "visitors": 10000000 }] ) # Response: [ { "errors": [], "id": "park_rocky-mountain" } ]
Delete Documentsedit
You can delete documents from an Engine with the delete_documents()
method:
# Request: resp = app_search.delete_documents( engine_name="national-parks", document_ids=["park_rocky-mountain"] ) # Response: [ { "deleted": True, "id": "park_rocky-mountain" } ]
Schema APIsedit
Now that we’ve indexed some data we should take a look at the way the data is being indexed by our Engine.
Get Schemaedit
First take a look at the existing Schema inferred from our data:
# Request: resp = app_search.get_schema( engine_name="national-parks" ) # Response: { "acres": "text", "date_established": "text", "location": "text", "nps_link": "text", "states": "text", "title": "text", "visitors": "text", "world_heritage_site": "text" }
Update Schemaedit
Looks like the date_established
field wasn’t indexed
as a date
as desired. Update the type of the date_established
field:
# Request: resp = app_search.put_schema( engine_name="national-parks", schema={ "date_established": "date" } ) # Response: { "acres": "number", "date_established": "date", # Type has been updated! "location": "geolocation", "nps_link": "text", "square_km": "number", "states": "text", "title": "text", "visitors": "number", "world_heritage_site": "text" }
Search APIsedit
Once documents are ingested and the Schema is set properly
you can use the search()
method to search through an Engine
for matching documents.
The Search API has many options, read the Search API documentation for a list of all options.
Single Searchedit
# Request: resp = app_search.search( engine_name="national-parks", body={ "query": "rock" } ) # Response: { "meta": { "alerts": [], "engine": { "name": "national-parks-demo", "type": "default" }, "page": { "current": 1, "size": 10, "total_pages": 2, "total_results": 15 }, "request_id": "6266df8b-8b19-4ff0-b1ca-3877d867eb7d", "warnings": [] }, "results": [ { "_meta": { "engine": "national-parks-demo", "id": "park_rocky-mountain", "score": 6776379.0 }, "acres": { "raw": 265795.2 }, "date_established": { "raw": "1915-01-26T06:00:00+00:00" }, "id": { "raw": "park_rocky-mountain" }, "location": { "raw": "40.4,-105.58" }, "nps_link": { "raw": "https://www.nps.gov/romo/index.htm" }, "square_km": { "raw": 1075.6 }, "states": { "raw": [ "Colorado" ] }, "title": { "raw": "Rocky Mountain" }, "visitors": { "raw": 4517585.0 }, "world_heritage_site": { "raw": "false" } } ] }
Multi Searchedit
Multiple searches can be executed at the same time with the multi_search()
method:
# Request: resp = app_search.multi_search( engine_name="national-parks", body={ "queries": [ {"query": "rock"}, {"query": "lake"} ] } ) # Response: [ { "meta": { "alerts": [], "engine": { "name": "national-parks-demo", "type": "default" }, "page": { "current": 1, "size": 1, "total_pages": 15, "total_results": 15 }, "warnings": [] }, "results": [ { "_meta": { "engine": "national-parks", "id": "park_rocky-mountain", "score": 6776379.0 }, "acres": { "raw": 265795.2 }, "date_established": { "raw": "1915-01-26T06:00:00+00:00" }, "id": { "raw": "park_rocky-mountain" }, "location": { "raw": "40.4,-105.58" }, "nps_link": { "raw": "https://www.nps.gov/romo/index.htm" }, "square_km": { "raw": 1075.6 }, "states": { "raw": [ "Colorado" ] }, "title": { "raw": "Rocky Mountain" }, "visitors": { "raw": 4517585.0 }, "world_heritage_site": { "raw": "false" } } ] }, ... ]
Curation APIsedit
Curations hide or promote result content for pre-defined search queries.
Create Curationedit
# Request: resp = app_search.create_curation( engine_name="national-parks", queries=["rocks", "rock", "hills"], promoted_doc_ids=["park_rocky-mountains"], hidden_doc_ids=["park_saguaro"] ) # Response: { "id": "cur-6011f5b57cef06e6c883814a" }
Get Curationedit
# Request: resp = app_search.get_curation( engine_name="national-parks", curation_id="cur-6011f5b57cef06e6c883814a" ) { "hidden": [ "park_saguaro" ], "id": "cur-6011f5b57cef06e6c883814a", "promoted": [ "park_rocky-mountains" ], "queries": [ "rocks", "rock", "hills" ] }
Update Curationedit
# Request: app_search.put_curation( engine_name='my-engine', curation_id='cur-6011f5b57cef06e6c883814a', queries=["foo", "bar"], promoted=["doc-1", "doc-2"], hidden=["doc-3"] ) # Response: { "id": "cur-6011f5b57cef06e6c883814a" }
List Curationsedit
# Request: app_search.list_curations( engine_name="national-parks" )
Delete Curationedit
# Request: app_search.delete_curation( engine_name="national-parks", curation_id="cur-6011f5b57cef06e6c883814a" )
Meta Engine APIsedit
Meta Engines is an Engine that has no documents of its own, instead it combines multiple other Engines so that they can be searched together as if they were a single Engine.
The Engines that comprise a Meta Engine are referred to as "Source Engines".
Create Meta Engineedit
Creating a Meta Engine uses the create_engine()
method
and set the type
parameter to "meta"
.
# Request: app_search.create_engine( engine_name="meta-engine", type="meta", source_engines=["national-parks"] ) # Response: { "document_count": 1, "name": "meta-engine", "source_engines": [ "national-parks" ], "type": "meta" }
Searching Documents from a Meta Engineedit
# Request: app_search.search( engine_name="meta-engine", body={ "query": "rock" } ) # Response: { "meta": { "alerts": [], "engine": { "name": "meta-engine", "type": "meta" }, "page": { "current": 1, "size": 10, "total_pages": 1, "total_results": 1 }, "request_id": "aef3d3d3-331c-4dab-8e77-f42e4f46789c", "warnings": [] }, "results": [ { "_meta": { "engine": "national-parks", "id": "park_black-canyon-of-the-gunnison", "score": 2.43862 }, "id": { "raw": "national-parks|park_black-canyon-of-the-gunnison" }, "nps_link": { "raw": "https://www.nps.gov/blca/index.htm" }, "square_km": { "raw": 124.4 }, "states": { "raw": [ "Colorado" ] }, "title": { "raw": "Black Canyon of the Gunnison" }, "world_heritage_site": { "raw": "false" } } ] }
Notice how the id
of the result we receive (national-parks|park_black-canyon-of-the-gunnison
)
includes a prefix of the Source Engine that the result is from to distinguish them from
results with the same id
but different Source Engine within a search result.
Adding Source Engines to an existing Meta Engineedit
If we have an existing Meta Engine named meta-engine
we can add additional Source Engines to it with the
add_meta_engine_source()
method. Here we add the
state-parks
Engine:
# Request: app_search.add_meta_engine_source( engine_name="meta-engine", source_engines=["state-parks"] ) # Response: { "document_count": 1, "name": "meta-engine", "source_engines": [ "national-parks", "state-parks" ], "type": "meta" }
Removing Source Engines from a Meta Engineedit
If we change our mind about state-parks
being a Source Engine for
meta-engine
we can use the delete_meta_source_engines()
method:
# Request: app_search.delete_meta_engine_source( engine_name="meta-engine", source_engines=["state-parks"] ) # Response: { "document_count": 1, "name": "meta-engine", "source_engines": [ "national-parks" ], "type": "meta" }
Web Crawler APIsedit
Domainsedit
# Create a domain resp = app_search.create_crawler_domain( engine_name="crawler-engine", body={ "name": "https://example.com" } ) domain_id = resp["id"] # Get a domain app_search.get_crawler_domain( engine_name="crawler-engine", domain_id=domain_id ) # Update a domain app_search.put_crawler_domain( engine_name="crawler-engine", domain_id=domain_id, body={ ... } ) # Delete a domain app_search.delete_crawler_domain( engine_name="crawler-engine", domain_id=domain_id ) # Validate a domain app_search.get_crawler_domain_validation_result( body={ "url": "https://example.com", "checks": [ "dns", "robots_txt", "tcp", "url", "url_content", "url_request" ] } ) # Extract content from a URL app_search.get_crawler_url_extraction_result( engine_name="crawler-engine", body={ "url": "https://example.com" } ) # Trace a URL app_search.get_crawler_url_tracing_result( engine_name="crawler-engine", body={ "url": "https://example.com" } )
Crawlsedit
# Get the active crawl app_search.get_crawler_active_crawl_request( engine_name="crawler-engine", ) # Start a crawl app_search.create_crawler_crawl_request( engine_name="crawler-engine" ) # Cancel the active crawl app_search.delete_crawler_active_crawl_request( engine_name="crawler-engine" )
Entry Pointsedit
# Create an entry point resp = app_search.create_crawler_entry_point( engine_name="crawler-engine", body={ "value": "/blog" } ) entry_point_id = resp["id"] # Delete an entry point app_search.delete_crawler_entry_point( engine_name="crawler-engine", entry_point_id=entry_point_id )
Crawl Rulesedit
# Create a crawl rule resp = app_search.create_crawler_crawl_rule( engine_name="crawler-engine", domain_id=domain_id, body={ "policy": "deny", "rule": "ends", "pattern": "/dont-crawl" } ) crawl_rule_id = resp["id"] # Delete a crawl rule app_search.delete_crawler_crawl_rule( engine_name="crawler-engine", domain_id=domain_id, crawl_rule_id=crawl_rule_id )
Sitemapsedit
# Create a sitemap resp = app_search.create_crawler_sitemap( engine_name="crawler-engine", domain_id=domain_id, url="https://example.com/sitemap.xml" ) sitemap_id = resp["id"] # Delete a sitemap app_search.delete_crawler_sitemap( engine_name="crawler-engine", domain_id=domain_id, sitemap_id=sitemap_id )
Adaptive Relevance APIsedit
Settingsedit
# Get adaptive relevenace settings for an Engine app_search.get_adaptive_relevance_settings( engine_name="adaptive-engine" ) { "curation": { "enabled": True, "mode": "manual", "timeframe": 7, "max_size": 3, "min_clicks": 20, "schedule_frequency": 1, "schedule_unit": "day" } } # Enable automatic adaptive relevance app_search.put_adaptive_relevance_settings( engine_name="adaptive-engine", curation={ "mode": "automatic" } )
Suggestionsedit
# List all adaptive relevance suggestions for an engine app_search.list_adaptive_relevance_suggestions( engine_name="adaptive-engine" ) { "meta": { "page": { "current": 1, "total_pages": 1, "total_results": 2, "size": 25 } }, "results": [ { "query": "forest", "type": "curation", "status": "pending", "updated_at": "2021-09-02T07:22:23Z", "created_at": "2021-09-02T07:22:23Z", "promoted": [ "park_everglades", "park_american-samoa", "park_arches" ], "operation": "create" }, { "query": "park", "type": "curation", "status": "pending", "updated_at": "2021-10-22T07:34:12Z", "created_at": "2021-10-22T07:34:54Z", "promoted": [ "park_yellowstone" ], "operation": "create", "override_manual_curation": true } ] } # Get adaptive relevance suggestions for a query app_search.get_adaptive_relevance_suggestions( engine_name="adaptive-engine", query="forest", ) { "meta": { "page": { "current": 1, "total_pages": 1, "total_results": 1, "size": 25 } }, "results": [ { "query": "forest", "type": "curation", "status": "pending", "updated_at": "2021-09-02T07:22:23Z", "created_at": "2021-09-02T07:22:23Z", "promoted": [ "park_everglades", "park_american-samoa", "park_arches" ], "operation": "create" } ] } # Update status of adaptive relevance suggestions app_search.put_adaptive_relevance_suggestions( engine_name="adaptive-engine", suggestions=[ {"query": "forest", "type": "curation", "status": "applied"}, {"query": "mountain", "type": "curation", "status": "rejected"} ] )