Let’s be honest: in 2025, users have zero patience for slow search bars. If your application takes three seconds to return a result—or worse, returns irrelevant results because of a typo—you are losing engagement.
For years, many PHP developers relied on the classic WHERE title LIKE '%keyword%' SQL query. That works for a blog with 50 posts, but it degrades quickly as the table grows: the leading wildcard prevents the database from using a regular index, so every search becomes a full table scan. It also lacks features users take for granted, like fuzzy matching (handling typos), relevance scoring, and faceting.
This is where Elasticsearch comes in.
In this guide, we are going to build production-grade search functionality using PHP 8.x and Elasticsearch 8. We aren’t just going to dump code; we are going to look at the architecture, the “gotchas” regarding index mapping, and how to keep your data in sync.
Why Elasticsearch? #
Before we open our IDEs, let’s understand the tool. Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. A relational database (MySQL/PostgreSQL) excels at data integrity and relationships; Elasticsearch excels at full-text search, because it stores text in an inverted index.
Think of the index at the back of a textbook. If you want to find “PHP,” you don’t read every page (a Table Scan); you look at the index, see which pages contain “PHP,” and go there immediately. That is essentially how Elasticsearch operates, but at a massive scale.
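To make the textbook analogy concrete, here is a toy inverted index in plain PHP. It is only an illustration: the function name buildInvertedIndex and the sample documents are made up for this sketch, and a real Lucene index also handles analysis, scoring, and compression.

```php
<?php
declare(strict_types=1);

/**
 * Builds a toy inverted index: term => list of document IDs containing it.
 *
 * @param array<int, string> $documents Document ID => text
 * @return array<string, int[]>
 */
function buildInvertedIndex(array $documents): array
{
    $index = [];
    foreach ($documents as $docId => $text) {
        // Crude "analysis": lowercase, then split on anything that is not a letter or digit
        $terms = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($terms) as $term) {
            $index[$term][] = $docId;
        }
    }
    return $index;
}

$index = buildInvertedIndex([
    1 => 'Professional PHP Developer Guide',
    2 => 'Learning Elasticsearch with PHP',
    3 => 'A Guide to Docker',
]);

// Looking up a term is a single array access, not a scan over every document
print_r($index['php']);   // documents 1 and 2
print_r($index['guide']); // documents 1 and 3
```

The lookup cost depends on the number of matching terms rather than on the number of documents, which is the property the textbook analogy describes.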
Prerequisites & Environment #
To follow this tutorial, you will need:
- PHP 8.2+: We will use modern features like typed properties and constructor promotion.
- Composer: For dependency management.
- Docker & Docker Compose: To run the Elasticsearch instance without messing up your local machine.
The Architecture #
Here is how our PHP application will interact with the search engine. Notice that we don’t usually replace the primary database (MySQL) with Elasticsearch. Instead, we sync data to Elasticsearch specifically for reading/searching.
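One common way to keep the two stores aligned is to treat MySQL as the source of truth and index into Elasticsearch right after each successful write. The sketch below shows that flow with plain callables standing in for the real database and search calls. The function saveAndSync and the in-memory arrays are hypothetical names used only for illustration, not part of any library.

```php
<?php
declare(strict_types=1);

/**
 * Writes to the primary store first, then mirrors the record into the search index.
 * A failed search write is recorded for a later retry instead of failing the request.
 *
 * @param callable $saveToDatabase Stand-in for the MySQL write
 * @param callable $indexForSearch Stand-in for the Elasticsearch write
 * @param int[]    $retryQueue     IDs that still need to be indexed
 */
function saveAndSync(array $product, callable $saveToDatabase, callable $indexForSearch, array &$retryQueue): void
{
    $saveToDatabase($product); // Source of truth: let exceptions propagate

    try {
        $indexForSearch($product);
    } catch (Throwable $e) {
        // The search index is a derived copy, so remember the ID and re-index it later
        $retryQueue[] = $product['id'];
    }
}

// In-memory stand-ins so the sketch runs without MySQL or Elasticsearch
$database = [];
$searchIndex = [];
$retryQueue = [];

$save = function (array $p) use (&$database): void {
    $database[$p['id']] = $p;
};
$index = function (array $p) use (&$searchIndex): void {
    $searchIndex[$p['id']] = $p;
};
$brokenIndex = function (array $p): void {
    throw new RuntimeException('search cluster unreachable');
};

saveAndSync(['id' => 1, 'title' => 'PHP Guide'], $save, $index, $retryQueue);
saveAndSync(['id' => 2, 'title' => 'Docker Guide'], $save, $brokenIndex, $retryQueue);
```

Both products end up in the primary store, while the second one is only queued for re-indexing. Losing a search write is recoverable because the index can always be rebuilt from the database.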
Step 1: Setting Up the Infrastructure #
First, let’s get an Elasticsearch instance running. We will also include Kibana, a web interface that lets us inspect our data easily.
Create a docker-compose.yml file in your project root:
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.1
    container_name: php_es_node
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
    networks:
      - es-net
  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.1
    container_name: php_kibana
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
    networks:
      - es-net
networks:
  es-net:
    driver: bridge
Note: For this tutorial, we disabled security (xpack.security.enabled=false) to simplify the connection. In production, you must enable SSL and authentication.
Run the stack:
docker-compose up -d
Verify it is running by visiting http://localhost:9200. You should see a JSON response with the tagline “You Know, for Search”.
Step 2: Installing the PHP Client #
We need the official client provided by Elastic. It handles connection pooling, retries, and serialization for us.
composer require elasticsearch/elasticsearch
Step 3: Configuring the Client #
Let’s create a dedicated SearchClientFactory class to manage our connection. This is better than instantiating the library directly in your controllers because it allows you to inject configuration and handle logging centrally.
<?php
declare(strict_types=1);

namespace App\Search;

use Elastic\Elasticsearch\Client;
use Elastic\Elasticsearch\ClientBuilder;

class SearchClientFactory
{
    public static function create(): Client
    {
        // In a real app, pull these from .env
        $hosts = ['http://localhost:9200'];

        return ClientBuilder::create()
            ->setHosts($hosts)
            ->setRetries(2) // Retry twice on connection failure
            ->build();
    }
}
Step 4: Indexing Data (The “Write” Side) #
Before we can search, we need data. In Elasticsearch, data is stored in Indices (similar to tables in SQL).
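Unlike SQL, you can dump data without a schema, and Elasticsearch will guess the types. Don’t do this. Dynamic mapping can guess wrong (a numeric-looking string, a date in an unexpected format), it indexes every string as both text and keyword, and an existing field’s type cannot be changed without reindexing. Always define your mapping explicitly.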
Let’s create a service to manage our products index.
<?php
declare(strict_types=1);

namespace App\Search;

use Elastic\Elasticsearch\Client;

class ProductIndexer
{
    private string $indexName = 'products_v1';

    // Constructor promotion (PHP 8.0+) declares and assigns $client in one step
    public function __construct(private Client $client)
    {
    }

    public function createIndex(): void
    {
        $params = [
            'index' => $this->indexName,
            'body' => [
                'settings' => [
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                    'analysis' => [
                        'analyzer' => [
                            // Makes the index default explicit (the built-in standard analyzer).
                            // Partial matching would need a custom analyzer, e.g. one based on edge_ngram.
                            'default' => [
                                'type' => 'standard'
                            ]
                        ]
                    ]
                ],
                'mappings' => [
                    'properties' => [
                        'id' => ['type' => 'integer'],
                        'title' => ['type' => 'text', 'analyzer' => 'standard'],
                        'description' => ['type' => 'text', 'analyzer' => 'standard'],
                        'price' => ['type' => 'float'],
                        'category' => ['type' => 'keyword'], // Keyword is for exact filtering
                        'created_at' => ['type' => 'date']
                    ]
                ]
            ]
        ];

        // Delete if exists (for development only!)
        if ($this->client->indices()->exists(['index' => $this->indexName])->asBool()) {
            $this->client->indices()->delete(['index' => $this->indexName]);
        }

        $this->client->indices()->create($params);
    }

    public function indexProduct(array $productData): void
    {
        $params = [
            'index' => $this->indexName,
            'id' => $productData['id'], // Sync ID with MySQL ID
            'body' => [
                'id' => $productData['id'],
                'title' => $productData['title'],
                'description' => $productData['description'],
                'price' => $productData['price'],
                'category' => $productData['category'],
                // Indexing time in ISO 8601; swap in the row's real creation time if you have one
                'created_at' => date('c'),
            ]
        ];

        $this->client->index($params);
    }
}
Usage #
// script_index.php
require 'vendor/autoload.php';

use App\Search\SearchClientFactory;
use App\Search\ProductIndexer;

$client = SearchClientFactory::create();
$indexer = new ProductIndexer($client);

// 1. Setup the schema
$indexer->createIndex();

// 2. Index a dummy product
$indexer->indexProduct([
    'id' => 1,
    'title' => 'Professional PHP Developer Guide',
    'description' => 'A comprehensive guide to modern PHP.',
    'price' => 49.99,
    'category' => 'books'
]);

echo "Index created and document added!";
Step 5: Implementing the Search (The “Read” Side) #
Now for the fun part. We want to find products. We will implement a multi_match query, which searches across multiple fields (e.g., title and description) simultaneously. We will also add fuzziness to handle typos.
<?php
declare(strict_types=1);

namespace App\Search;

use Elastic\Elasticsearch\Client;

class SearchService
{
    private string $indexName = 'products_v1';

    // Constructor promotion (PHP 8.0+) declares and assigns $client in one step
    public function __construct(private Client $client)
    {
    }

    public function search(string $query, ?string $category = null, int $limit = 10): array
    {
        $searchParams = [
            'index' => $this->indexName,
            'body' => [
                'size' => $limit,
                'query' => [
                    'bool' => [
                        'must' => [
                            'multi_match' => [
                                'query' => $query,
                                'fields' => ['title^3', 'description'], // Title is 3x more important
                                'fuzziness' => 'AUTO', // Handles typos like "PPH" instead of "PHP"
                            ]
                        ]
                    ]
                ]
            ]
        ];

        // Apply filtering if a category is provided.
        // Compare explicitly: a plain truthiness check would skip the category "0".
        if ($category !== null && $category !== '') {
            $searchParams['body']['query']['bool']['filter'][] = [
                'term' => ['category' => $category]
            ];
        }

        $response = $this->client->search($searchParams);

        return $this->formatResults($response->asArray());
    }

    private function formatResults(array $response): array
    {
        $hits = $response['hits']['hits'];

        return array_map(function ($hit) {
            return [
                'id' => $hit['_source']['id'],
                'title' => $hit['_source']['title'],
                'score' => $hit['_score'], // How relevant the result is
            ];
        }, $hits);
    }
}
Why use bool queries? #
The bool query is the heart of Elasticsearch. It combines logic:
- Must: The document must match this (contributes to score).
- Filter: The document must match this (Yes/No binary, does not calculate score, cached, very fast).
Always use filter for exact matches like categories, IDs, or date ranges. Use must for full-text keyword searches.
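As a rough sketch of that rule, the helper below assembles a bool query with the full-text part in must and the exact conditions in filter. The function name buildProductQuery and the price cap parameter are assumptions made for this example; the field names come from the mapping defined in Step 4.

```php
<?php
declare(strict_types=1);

/**
 * Builds a bool query: full-text matching in "must", exact conditions in "filter".
 */
function buildProductQuery(string $text, ?string $category = null, ?float $maxPrice = null): array
{
    $bool = [
        'must' => [
            ['multi_match' => [
                'query' => $text,
                'fields' => ['title^3', 'description'],
            ]],
        ],
    ];

    if ($category !== null) {
        // Exact match on a keyword field: no scoring needed
        $bool['filter'][] = ['term' => ['category' => $category]];
    }
    if ($maxPrice !== null) {
        // Range conditions are also yes/no questions, so they belong in filter
        $bool['filter'][] = ['range' => ['price' => ['lte' => $maxPrice]]];
    }

    return ['bool' => $bool];
}

$query = buildProductQuery('php guide', 'books', 50.0);
echo json_encode($query, JSON_PRETTY_PRINT), PHP_EOL;
```

The returned array can be placed under the query key of a search body. Only the multi_match clause influences the score; the two filter clauses just narrow the candidate set.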
SQL vs. Elasticsearch: A Quick Comparison #
When should you stick to SQL and when should you add the complexity of Elasticsearch?
| Feature | MySQL LIKE / REGEXP | Elasticsearch |
|---|---|---|
| Speed (Large Dataset) | Slow (Full Table Scan) | Very Fast (Inverted Index) |
| Typos / Fuzziness | Very Difficult to implement | Built-in (fuzziness: AUTO) |
| Relevance Scoring | No (Boolean only) | Yes (BM25 by default) |
| Infrastructure | Simple (Already exists) | Moderate (Requires JVM, maintenance) |
| Real-time | Instant | Near Real-time (~1s delay by default) |
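The fuzziness: AUTO entry in the table is worth unpacking. According to the Elasticsearch documentation, AUTO derives the allowed edit distance from the term length: terms of up to 2 characters must match exactly, terms of 3 to 5 characters allow one edit, and longer terms allow two. The helper below only mirrors that rule for illustration; autoFuzzinessEdits is a made-up name, not part of any client library.

```php
<?php
declare(strict_types=1);

/**
 * Mirrors the default AUTO fuzziness rule (equivalent to AUTO:3,6).
 * strlen counts bytes, which equals characters for ASCII terms.
 */
function autoFuzzinessEdits(string $term): int
{
    $length = strlen($term);
    if ($length <= 2) {
        return 0; // Too short: must match exactly
    }
    if ($length <= 5) {
        return 1; // One insertion, deletion, substitution, or transposition
    }
    return 2;
}

foreach (['go', 'php', 'elasticsearch'] as $term) {
    echo $term, ' => ', autoFuzzinessEdits($term), ' edit(s) allowed', PHP_EOL;
}
```

This is why a three-letter query such as "PPH" can still reach "PHP": swapping two adjacent characters counts as a single edit by default.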
Performance Tips and Common Pitfalls #
As you scale your PHP search implementation, keep these critical points in mind:
1. Pagination Deep-Diving #
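Standard pagination with from and size gets expensive as from grows: to serve from: 9990, size: 10, each shard has to collect and sort its top 10,000 hits just to return the last 10. By default Elasticsearch also rejects any request where from + size exceeds 10,000 (the index.max_result_window setting), so from: 10000, size: 10 fails outright.
- Solution: For deep pagination, use the search_after parameter with a unique sort key (like ID) rather than from.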
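A minimal sketch of search_after paging follows. It only builds the request parameters and reads the sort values from a response array, so it runs without a cluster. The helper names buildPageParams and lastSortValues are hypothetical, and the response fragment is hand-written for the example.

```php
<?php
declare(strict_types=1);

/**
 * Builds search parameters for one page using search_after instead of from.
 *
 * @param array|null $lastSort The "sort" values of the last hit on the previous page, or null for page one
 */
function buildPageParams(string $index, array $query, int $size, ?array $lastSort = null): array
{
    $body = [
        'size' => $size,
        'query' => $query,
        // _score first, then the unique id field as a tie-breaker so the order is stable
        'sort' => [
            ['_score' => 'desc'],
            ['id' => 'asc'],
        ],
    ];
    if ($lastSort !== null) {
        $body['search_after'] = $lastSort;
    }
    return ['index' => $index, 'body' => $body];
}

/**
 * Returns the sort values of the last hit, or null when the page is empty.
 */
function lastSortValues(array $response): ?array
{
    $hits = $response['hits']['hits'] ?? [];
    if ($hits === []) {
        return null;
    }
    return $hits[array_key_last($hits)]['sort'];
}

// A hand-written response fragment standing in for a real search result
$fakeResponse = ['hits' => ['hits' => [
    ['_id' => '7', 'sort' => [2.31, 7]],
    ['_id' => '3', 'sort' => [1.87, 3]],
]]];

$query = ['match' => ['title' => 'php']];
$firstPage = buildPageParams('products_v1', $query, 10);
$nextPage = buildPageParams('products_v1', $query, 10, lastSortValues($fakeResponse));
```

Each page costs roughly the same no matter how deep it is, because the shards can skip everything that sorts before the search_after values. If the index changes between requests, pages can shift; Elasticsearch offers a point in time (PIT) for a consistent view.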
2. The “Mapping Explosion” #
If you send JSON objects with arbitrary keys to Elasticsearch, dynamic mapping creates a new field for every unique key. With thousands of unique keys, the mapping (which is part of the cluster state) balloons, slowing the cluster down and eventually destabilizing it. Elasticsearch limits an index to 1,000 fields by default (index.mapping.total_fields.limit) for this reason.
- Solution: Set dynamic: strict in your mapping. This forces you to define fields explicitly and throws an error if unknown fields are sent.
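Here is one way the strict setting could look for the products index, together with a small pre-flight check in PHP. The helper names strictProductMapping and unknownFields are assumptions for this sketch; the field list repeats the mapping from Step 4.

```php
<?php
declare(strict_types=1);

/**
 * Returns an index-creation body for the products index with strict mapping.
 */
function strictProductMapping(): array
{
    return [
        'mappings' => [
            // Reject documents containing fields that are not listed below
            'dynamic' => 'strict',
            'properties' => [
                'id' => ['type' => 'integer'],
                'title' => ['type' => 'text'],
                'description' => ['type' => 'text'],
                'price' => ['type' => 'float'],
                'category' => ['type' => 'keyword'],
                'created_at' => ['type' => 'date'],
            ],
        ],
    ];
}

/**
 * Lists the keys of a document that the mapping does not define,
 * so they can be caught in PHP before Elasticsearch rejects the write.
 */
function unknownFields(array $document, array $mappingBody): array
{
    $known = array_keys($mappingBody['mappings']['properties']);
    return array_values(array_diff(array_keys($document), $known));
}

$body = strictProductMapping();
$unexpected = unknownFields(
    ['id' => 1, 'title' => 'PHP Guide', 'colour' => 'blue'],
    $body
);
print_r($unexpected); // lists "colour"
```

The array from strictProductMapping can be passed as the body when creating the index. The PHP-side check is optional, but it gives a clearer error message than a rejected indexing request.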
3. Bulk Indexing #
Don’t loop through your database and index documents one by one via HTTP. That is network suicide.
- Solution: Use the Bulk API. In PHP, gather 500-1000 documents into an array and send them in a single request.
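// Quick example of Bulk structure: send the documents in batches of 500
foreach (array_chunk($products, 500) as $batch) {
    $params = ['body' => []];
    foreach ($batch as $product) {
        // Action line: which index and document ID to write to
        $params['body'][] = [
            'index' => [
                '_index' => 'products_v1',
                '_id' => $product['id']
            ]
        ];
        $params['body'][] = $product; // The actual data
    }
    $response = $client->bulk($params);
    // A bulk request can succeed overall while individual items fail
    if ($response['errors']) {
        error_log('Bulk indexing reported item-level failures');
    }
}
Conclusion #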
Implementing search in PHP has evolved. We moved from simple SQL queries to sophisticated engines capable of understanding user intent. By integrating Elasticsearch, you provide your users with the speed and accuracy they expect in 2025.
Key Takeaways:
- Use Docker to spin up ES locally.
- Always define your Mapping explicitly; avoid dynamic mapping.
- Use the Bool Query to combine full-text search (score) with exact filtering (speed).
- Handle data ingestion efficiently using the Bulk API.
In the next article, we will discuss Aggregations—how to build those sidebar filters (e.g., “Price Range”, “Brands”) that you see on e-commerce sites, calculated dynamically based on the search results.
Ready to deploy? Make sure to secure your Elasticsearch instance with SSL and basic auth before exposing it to the production web!