Elasticsearch (Search service)
Elasticsearch is a distributed RESTful search engine built for the cloud.
See the Elasticsearch documentation for more information.
Supported versions
Grid | Dedicated |
---|---|
Deprecated versions
The following versions are available but are not receiving security updates from upstream, so their use is not recommended. They will be removed at some point in the future.
Grid | Dedicated |
---|---|
Relationship
The format exposed in the $PLATFORM_RELATIONSHIPS environment variable:
{
    "username": null,
    "scheme": "http",
    "service": "elasticsearch77",
    "fragment": null,
    "ip": "169.254.57.6",
    "hostname": "jmgjydr275pkj5v7prdj2asgxm.elasticsearch77.service._.eu-3.platformsh.site",
    "public": false,
    "cluster": "rjify4yjcwxaa-master-7rqtwti",
    "host": "elasticsearch.internal",
    "rel": "elasticsearch",
    "query": [],
    "path": null,
    "password": null,
    "type": "elasticsearch:7.7",
    "port": 9200,
    "host_mapped": false
}
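The variable itself holds a base64-encoded JSON document mapping each relationship name to a list of endpoints like the one above. As a minimal sketch (assuming a relationship named essearch, as in the usage example below), you could decode it directly in Python, although the config-reader libraries used in the examples below handle this for you:
import base64
import json
import os

# PLATFORM_RELATIONSHIPS is a base64-encoded JSON document mapping each
# relationship name to a list of endpoint definitions like the one shown above.
relationships = json.loads(base64.b64decode(os.environ['PLATFORM_RELATIONSHIPS']))

# 'essearch' is the relationship name from the usage example below (an assumption).
endpoint = relationships['essearch'][0]
print(endpoint['host'], endpoint['port'])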
Usage example
In your .platform/services.yaml:
searchelastic:
    type: elasticsearch:7.7
    disk: 256
In your .platform.app.yaml:
relationships:
    essearch: "searchelastic:elasticsearch"
Note:
You will need to use the elasticsearch type when defining the service:
# .platform/services.yaml
service_name:
    type: elasticsearch:version
    disk: 256
and the endpoint elasticsearch when defining the relationship:
# .platform.app.yaml
relationships:
    relationship_name: "service_name:elasticsearch"
Your service_name and relationship_name are defined by you, but we recommend making them distinct from each other.
You can then use the service in your application with something like:
package sh.platform.languages.sample;

import org.elasticsearch.action.admin.indices.refresh.RefreshRequest;
import org.elasticsearch.action.admin.indices.refresh.RefreshResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import sh.platform.config.Config;
import sh.platform.config.Elasticsearch;

import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

import static java.util.concurrent.ThreadLocalRandom.current;

public class ElasticsearchSample implements Supplier<String> {

    @Override
    public String get() {
        StringBuilder logger = new StringBuilder();

        // Create a new config object to ease reading the Platform.sh environment variables.
        // You can alternatively use getenv() yourself.
        Config config = new Config();

        Elasticsearch elasticsearch = config.getCredential("elasticsearch", Elasticsearch::new);

        // Create an Elasticsearch client object.
        RestHighLevelClient client = elasticsearch.get();

        try {
            String index = "animals";
            String type = "mammals";

            // Index a few documents.
            final List<String> animals = Arrays.asList("dog", "cat", "monkey", "horse");
            for (String animal : animals) {
                Map<String, Object> jsonMap = new HashMap<>();
                jsonMap.put("name", animal);
                jsonMap.put("age", current().nextInt(1, 10));
                jsonMap.put("is_cute", current().nextBoolean());
                IndexRequest indexRequest = new IndexRequest(index, type)
                        .id(animal).source(jsonMap);
                client.index(indexRequest, RequestOptions.DEFAULT);
            }

            RefreshRequest refresh = new RefreshRequest(index);
            // Force just-added items to be indexed.
            RefreshResponse refreshResponse = client.indices().refresh(refresh, RequestOptions.DEFAULT);

            // Search for documents.
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            sourceBuilder.query(QueryBuilders.termQuery("name", "dog"));
            SearchRequest searchRequest = new SearchRequest();
            searchRequest.indices(index);
            searchRequest.source(sourceBuilder);
            SearchResponse search = client.search(searchRequest, RequestOptions.DEFAULT);
            for (SearchHit hit : search.getHits()) {
                String id = hit.getId();
                final Map<String, Object> source = hit.getSourceAsMap();
                logger.append(String.format("result id %s source: %s", id, source)).append('\n');
            }

            // Delete documents.
            for (String animal : animals) {
                client.delete(new DeleteRequest(index, type, animal), RequestOptions.DEFAULT);
            }
        } catch (IOException exp) {
            throw new RuntimeException("An error occurred while executing Elasticsearch: " + exp.getMessage());
        }
        return logger.toString();
    }
}
const elasticsearch = require("elasticsearch");
const config = require("platformsh-config").config();

exports.usageExample = async function () {
    const credentials = config.credentials("elasticsearch");

    const client = new elasticsearch.Client({
        host: `${credentials.host}:${credentials.port}`,
    });

    const index = "my_index";
    const type = "People";

    // Index a few documents.
    const names = ["Ada Lovelace", "Alonzo Church", "Barbara Liskov"];
    const message = {
        refresh: "wait_for",
        body: names.flatMap((name) => [
            { index: { _index: index, _type: type } },
            { name },
        ]),
    };
    await client.bulk(message);

    // Search for documents.
    const response = await client.search({
        index,
        q: "name:Barbara Liskov",
    });

    const outputRows = response.hits.hits
        .map(
            ({ _id: id, _source: { name } }) =>
                `<tr><td>${id}</td><td>${name}</td></tr>\n`
        )
        .join("\n");

    // Clean up after ourselves.
    await Promise.allSettled(
        response.hits.hits.map(({ _id: id }) =>
            client.delete({
                index: index,
                type: type,
                id,
            })
        )
    );

    return `
<table>
    <thead>
        <tr>
            <th>ID</th><th>Name</th>
        </tr>
    </thead>
    <tbody>
        ${outputRows}
    </tbody>
</table>
`;
};
<?php

declare(strict_types=1);

use Elasticsearch\ClientBuilder;
use Platformsh\ConfigReader\Config;

// Create a new config object to ease reading the Platform.sh environment variables.
// You can alternatively use getenv() yourself.
$config = new Config();

// Get the credentials to connect to the Elasticsearch service.
$credentials = $config->credentials('elasticsearch');

try {
    // The Elasticsearch library lets you connect to multiple hosts.
    // On Platform.sh Standard there is only a single host so just
    // register that.
    $hosts = [
        [
            'scheme' => $credentials['scheme'],
            'host' => $credentials['host'],
            'port' => $credentials['port'],
        ]
    ];

    // Create an Elasticsearch client object.
    $builder = ClientBuilder::create();
    $builder->setHosts($hosts);
    $client = $builder->build();

    $index = 'my_index';
    $type = 'People';

    // Index a few documents.
    $params = [
        'index' => $index,
        'type' => $type,
    ];

    $names = ['Ada Lovelace', 'Alonzo Church', 'Barbara Liskov'];

    foreach ($names as $name) {
        $params['body']['name'] = $name;
        $client->index($params);
    }

    // Force just-added items to be indexed.
    $client->indices()->refresh(array('index' => $index));

    // Search for documents.
    $result = $client->search([
        'index' => $index,
        'type' => $type,
        'body' => [
            'query' => [
                'match' => [
                    'name' => 'Barbara Liskov',
                ],
            ],
        ],
    ]);

    if (isset($result['hits']['hits'])) {
        print <<<TABLE
<table>
<thead>
<tr><th>ID</th><th>Name</th></tr>
</thead>
<tbody>
TABLE;
        foreach ($result['hits']['hits'] as $record) {
            printf("<tr><td>%s</td><td>%s</td></tr>\n", $record['_id'], $record['_source']['name']);
        }
        print "</tbody>\n</table>\n";
    }

    // Delete documents.
    $params = [
        'index' => $index,
        'type' => $type,
    ];

    $ids = array_map(function ($row) {
        return $row['_id'];
    }, $result['hits']['hits']);

    foreach ($ids as $id) {
        $params['id'] = $id;
        $client->delete($params);
    }
} catch (Exception $e) {
    print $e->getMessage();
}
import elasticsearch
from platformshconfig import Config


def usage_example():
    # Create a new Config object to ease reading the Platform.sh environment variables.
    # You can alternatively use os.environ yourself.
    config = Config()

    # Get the credentials to connect to the Elasticsearch service.
    credentials = config.credentials('elasticsearch')

    try:
        # The Elasticsearch library lets you connect to multiple hosts.
        # On Platform.sh Standard there is only a single host so just register that.
        hosts = {
            "scheme": credentials['scheme'],
            "host": credentials['host'],
            "port": credentials['port']
        }

        # Create an Elasticsearch client object.
        client = elasticsearch.Elasticsearch([hosts])

        # Index a few documents.
        es_index = 'my_index'
        es_type = 'People'

        params = {
            "index": es_index,
            "type": es_type,
            "body": {"name": ''}
        }

        names = ['Ada Lovelace', 'Alonzo Church', 'Barbara Liskov']

        ids = {}
        for name in names:
            params['body']['name'] = name
            ids[name] = client.index(index=params["index"], doc_type=params["type"], body=params['body'])

        # Force just-added items to be indexed.
        client.indices.refresh(index=es_index)

        # Search for documents.
        result = client.search(index=es_index, body={
            'query': {
                'match': {
                    'name': 'Barbara Liskov'
                }
            }
        })

        table = '''<table>
<thead>
<tr><th>ID</th><th>Name</th></tr>
</thead>
<tbody>'''

        if result['hits']['hits']:
            for record in result['hits']['hits']:
                table += '''<tr><td>{0}</td><td>{1}</td></tr>\n'''.format(record['_id'], record['_source']['name'])
            table += '''</tbody>\n</table>\n'''

        # Delete documents.
        params = {
            "index": es_index,
            "type": es_type,
        }

        for name in names:
            client.delete(index=params['index'], doc_type=params['type'], id=ids[name]['_id'])

        return table

    except Exception as e:
        return e
Note:
When you create an index on Elasticsearch, you should not specify the number_of_shards and number_of_replicas settings in your Elasticsearch API call. These values will be set automatically based on available resources.
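For example, with the Python client from the example above, a minimal sketch of creating an index simply omits those settings (the index name my_index is illustrative):
# A minimal sketch: create the index without number_of_shards or number_of_replicas;
# those values are set automatically based on the resources available to the service.
client.indices.create(index='my_index')

# Mappings and analysis settings can still be passed in the request body as usual.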
Authentication
By default, Elasticsearch has no authentication. No username or password is required to connect to it.
Starting with Elasticsearch 7.2 you may optionally enable HTTP Basic authentication. To do so, include the following in your services.yaml configuration:
search:
    type: elasticsearch:7.2
    disk: 2048
    configuration:
        authentication:
            enabled: true
That will enable mandatory HTTP Basic auth on all requests. The credentials will be available in any relationships that point at that service, in the username and password properties, respectively.
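As a sketch, using the Python client from the example above, those credentials can be passed as HTTP Basic auth when building the client (the http_auth parameter is the standard elasticsearch-py mechanism; the username and password keys are only populated once authentication is enabled):
import elasticsearch
from platformshconfig import Config

config = Config()
credentials = config.credentials('elasticsearch')

# A minimal sketch: pass the relationship's username and password as
# HTTP Basic auth credentials when creating the client.
client = elasticsearch.Elasticsearch(
    [{
        "scheme": credentials['scheme'],
        "host": credentials['host'],
        "port": credentials['port'],
    }],
    http_auth=(credentials['username'], credentials['password']),
)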
This functionality is generally not required if Elasticsearch is not exposed on its own public HTTP route. However, certain applications may require it, and it also allows you to safely expose Elasticsearch directly to the web. To do so, add a route to routes.yaml that has search:elasticsearch as its upstream (where search is whatever you named the service in services.yaml). For example:
"https://es.{default}":
type: upstream
upstream: search:elasticsearch
Plugins
The Elasticsearch 2.4 and later services offer a number of plugins. To enable them, list them under the configuration.plugins key in your services.yaml file, like so:
search:
    type: "elasticsearch:7.2"
    disk: 1024
    configuration:
        plugins:
            - analysis-icu
            - lang-python
In this example you’d have the ICU analysis plugin and Python script support plugin.
If there is a publicly available plugin you need that is not listed here, please contact our support team.
Available plugins
This is the complete list of official Elasticsearch plugins that can be enabled:
Plugin | Description | 2.4 | 5.2 | 5.4 | 6.5 | 7.2 |
---|---|---|---|---|---|---|
analysis-icu | Support ICU Unicode text analysis | * | * | * | * | * |
analysis-nori | Integrates Lucene nori analysis module into Elasticsearch | | | | * | * |
analysis-kuromoji | Japanese language support | * | * | * | * | * |
analysis-smartcn | Smart Chinese Analysis Plugins | * | * | * | * | * |
analysis-stempel | Stempel Polish Analysis Plugin | * | * | * | * | * |
analysis-phonetic | Phonetic analysis | * | * | * | * | * |
analysis-ukrainian | Ukrainian language support | | * | * | * | * |
cloud-aws | AWS Cloud plugin, allows storing indices on AWS S3 | * | | | | |
delete-by-query | Support for deleting documents matching a given query | * | | | | |
discovery-multicast | Ability to form a cluster using TCP/IP multicast messages | * | | | | |
ingest-attachment | Extract file attachments in common formats (such as PPT, XLS, and PDF) | | * | * | * | * |
ingest-user-agent | Extracts details from the user agent string a browser sends with its web requests | | * | * | * | |
lang-javascript | Javascript language plugin, allows the use of Javascript in Elasticsearch scripts | * | * | | | |
lang-python | Python language plugin, allows the use of Python in Elasticsearch scripts | * | * | * | | |
mapper-annotated-text | Adds support for text fields with markup used to inject annotation tokens into the index | | | | * | * |
mapper-attachments | Mapper attachments plugin for indexing common file types | * | * | * | | |
mapper-murmur3 | Murmur3 mapper plugin for computing hashes at index-time | * | * | * | * | * |
mapper-size | Size mapper plugin, enables the _size meta field | * | * | * | * | * |
repository-s3 | Support for using S3 as a repository for Snapshot/Restore | | * | * | * | * |
Plugins removal
Removing plugins previously added in your services.yaml file will not automatically uninstall them from your Elasticsearch instances. This is deliberate, as removing a plugin may result in data loss or corruption of existing data that relied on that plugin. Removing a plugin will usually require reindexing.
If you wish to permanently remove a previously-enabled plugin, you will need to follow the “Upgrading” procedure below to create a new instance of Elasticsearch and migrate to it. In most cases that is not necessary, however, as an unused plugin has no appreciable impact on the server.
Upgrading
The Elasticsearch data format sometimes changes between versions in incompatible ways. Elasticsearch does not include a data upgrade mechanism as it is expected that all indexes can be regenerated from stable data if needed. To upgrade (or downgrade) Elasticsearch you will need to use a new service from scratch.
There are two ways of doing that.
Destructive
In your services.yaml file, change the version of your Elasticsearch service and its name. Then update the name in the .platform.app.yaml relationships block.
When you push that to Platform.sh, the old service will be deleted and a new one with the new name will be created, with no data. You can then have your application reindex data as appropriate.
This approach is simple but has the downside of temporarily having an empty Elasticsearch instance, which your application may or may not handle gracefully, and needing to rebuild your index afterward. Depending on the size of your data that could take a while.
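As a sketch, assuming the searchelastic service from the usage example above, the destructive change amounts to renaming the service and bumping its version in both files (the name search_upgraded is purely illustrative):
# .platform/services.yaml
search_upgraded:
    type: elasticsearch:version
    disk: 256

# .platform.app.yaml
relationships:
    essearch: "search_upgraded:elasticsearch"
On the next push, searchelastic is removed and search_upgraded is created empty, ready to be reindexed.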
Transitional
For a transitional approach you will temporarily have two Elasticsearch services. Add a second Elasticsearch service with the new version and a new name, and give it a new relationship in .platform.app.yaml. You can optionally run in that configuration for a while to allow your application to populate indexes in the new service as well.
Once you’re ready to cut over, remove the old Elasticsearch service and relationship. You may optionally have the new Elasticsearch service use the old relationship name if that’s easier for your application to handle. Your application is now using the new Elasticsearch service.
This approach has the benefit of never being without a working Elasticsearch instance. On the downside, it requires two running Elasticsearch servers temporarily, each of which will consume resources and need adequate disk space. Depending on the size of your data that may be a lot of disk space.
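As a sketch, the transitional configuration temporarily declares both services and both relationships side by side (the names searchelastic_new and essearch_new are purely illustrative):
# .platform/services.yaml
searchelastic:
    type: elasticsearch:7.7
    disk: 256
searchelastic_new:
    type: elasticsearch:version
    disk: 256

# .platform.app.yaml
relationships:
    essearch: "searchelastic:elasticsearch"
    essearch_new: "searchelastic_new:elasticsearch"
Once the new index is populated, remove searchelastic and its relationship (optionally renaming essearch_new back to essearch) and push again.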