Introduction #
For a long time, the “fire and forget” nature of PHP scripts meant that memory management was rarely a top priority for developers. A script would run, render HTML, and die—taking all its allocated memory with it.
However, the landscape of PHP development has shifted dramatically by 2025. With the rise of persistent application servers like Swoole, RoadRunner, and FrankenPHP, and of integrations such as Laravel Octane that run on top of them, PHP processes now run for days or weeks at a time. In these environments, a small memory leak isn’t just a nuisance; it’s a critical failure that can crash your production server.
Even in traditional FPM setups, data-intensive tasks (like Excel exports, image processing, or batch ETL jobs) require a surgical understanding of how PHP handles RAM.
In this deep dive, we are going to peel back the layers of the Zend Engine. We will explore how PHP variables are stored, how the Copy-on-Write (COW) mechanism saves memory, how the Garbage Collector (GC) deals with circular references, and how you can profile and optimize your code for maximum efficiency.
Prerequisites & Environment #
To follow along with the code examples and benchmarks in this article, you should have the following setup:
- PHP 8.2 or 8.3: We will use modern syntax and types.
- CLI Access: Most memory experiments are best run from the command line.
- Composer: For autoloading (if you expand the examples).
- Xdebug (Optional but Recommended): For deep profiling.
We will not be using any external frameworks for the core logic to keep the focus strictly on the engine’s behavior.
Part 1: Under the Hood - Zvals and Reference Counting #
To understand memory management, you must understand the zval (Zend Value).
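When you create a variable in PHP, for example $a = "Hello";, PHP doesn’t just store the string. It creates an internal container called a zval. Conceptually, this container tracks:
- The type of the variable (string, integer, array, etc.).
- The value (or a pointer to the value).
- A refcount (Reference Count).
- A flag indicating if it’s a reference (&).
Since PHP 7, the refcount is stored on the value itself (strings, arrays, objects) rather than in the zval, and simple scalars such as integers, floats, booleans, and null are stored inline and are not reference-counted at all. The mental model below still holds for everything that lives on the heap.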
The Concept of Reference Counting #
PHP manages memory primarily through Reference Counting. This is a simple yet effective mechanism:
- Every time a variable points to a value, the refcount increases.
- Every time a variable is unset or goes out of scope, the refcount decreases.
- When refcount hits 0, the memory is freed immediately.
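These rules can be observed from userland. The following is a minimal sketch: the Tracked class and its static $destroyed log are hypothetical helpers (not part of PHP), and the destructor is used only to reveal the moment the engine frees the object.

```php
<?php
// Hypothetical class used only to observe when the engine frees an object.
final class Tracked
{
    /** @var list<string> Names of destroyed instances, in order. */
    public static array $destroyed = [];

    public function __construct(private string $name) {}

    public function __destruct()
    {
        self::$destroyed[] = $this->name;
    }
}

$a = new Tracked('first');   // one handle to the object: $a
$b = $a;                     // two handles to the same object
unset($a);                   // one handle left, so the object stays alive
$aliveAfterFirstUnset = Tracked::$destroyed === [];

unset($b);                   // last handle gone: the destructor runs right away
$freedAfterSecondUnset = Tracked::$destroyed === ['first'];

var_dump($aliveAfterFirstUnset, $freedAfterSecondUnset);
```

Both flags should come out true: nothing is freed while a handle remains, and the object is destroyed at the exact statement that removes the last one.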
Copy-on-Write (COW) #
One of PHP’s smartest optimizations is Copy-on-Write. If you assign $b = $a, PHP does not duplicate the data in memory immediately. Instead, $b points to the same internal data structure as $a, and the refcount is incremented.
Memory is only duplicated if you modify one of the variables.
Practical Demonstration: COW in Action #
Let’s verify this behavior using memory_get_usage().
<?php
/**
* Helper function to format memory usage
*/
function formatBytes(int $bytes): string {
return round($bytes / 1024 / 1024, 2) . ' MB';
}
echo "Start: " . formatBytes(memory_get_usage()) . "\n";
// 1. Create a large array (approx 20MB)
$array = range(1, 1000000);
echo "After Array Creation: " . formatBytes(memory_get_usage()) . "\n";
// 2. Assign to a new variable
// EXPECTATION: Memory should NOT double here due to COW.
$copy = $array;
echo "After Assignment (\$copy = \$array): " . formatBytes(memory_get_usage()) . "\n";
// 3. Modify the copy
// EXPECTATION: Memory SHOULD increase now as COW triggers duplication.
$copy[0] = 'Trigger COW';
echo "After Modification (\$copy[0] = ...): " . formatBytes(memory_get_usage()) . "\n";
Output Analysis:
If you run this script, you will notice that step 2 barely increases memory usage (only a tiny overhead for the zval structure). Step 3 is where the heavy lifting happens, doubling the memory usage for the array data.
Pro Tip: This is why passing large arrays to functions in PHP is generally cheap, unless that function modifies the array. If the function modifies the array, passing by reference (&$array) might be more memory-efficient, though it comes with its own code-smell risks.
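The tip can be checked with a small experiment. This is a sketch under stated assumptions: the three function names are hypothetical, and each one simply reports how many bytes were allocated while its body ran. Exact numbers vary by PHP version, so only the orders of magnitude matter.

```php
<?php
// Hypothetical helpers: each returns the bytes allocated while its body ran.

function readsOnly(array $data): int
{
    $before = memory_get_usage();
    $first = $data[0];                  // reading never separates the array
    return memory_get_usage() - $before;
}

function writesToCopy(array $data): int
{
    $before = memory_get_usage();
    $data[0] = -1;                      // first write duplicates the shared array
    return memory_get_usage() - $before;
}

function writesInPlace(array &$data): int
{
    $before = memory_get_usage();
    $data[0] = -1;                      // by reference: the caller's array changes, no copy
    return memory_get_usage() - $before;
}

$big = range(1, 200000);

$readCost      = readsOnly($big);
$copyCost      = writesToCopy($big);
$callerIntact  = ($big[0] === 1);       // the by-value write did not touch $big
$referenceCost = writesInPlace($big);
$callerChanged = ($big[0] === -1);      // the by-reference write did

printf("read only:          %d bytes\n", $readCost);
printf("write to a copy:    %d bytes\n", $copyCost);
printf("write by reference: %d bytes\n", $referenceCost);
```

The by-reference variant avoids the copy, but it also changes the caller’s data, which is exactly the side effect that makes references a code smell when used casually.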
Part 2: The Villain - Circular References #
If Reference Counting is so great, why do we need a Garbage Collector?
The Achilles’ heel of refcounting is Circular References. This happens when a complex structure (like an Object or an Array) references itself, either directly or through other values. Even if you unset() the variable, the internal data structure still points to itself, keeping the refcount at 1 (or higher). Reference counting alone never frees that memory, which creates a memory leak.
Creating a Memory Leak #
Let’s intentionally leak memory to see the problem.
<?php
class Node {
public $child;
public $name;
public function __construct($name) {
$this->name = $name;
}
}
echo "Initial Memory: " . number_format(memory_get_usage()) . " bytes\n";
for ($i = 0; $i < 10000; $i++) {
$a = new Node("Node A");
$b = new Node("Node B");
// Create a circular reference
$a->child = $b;
$b->child = $a;
// We unset the variables in the current scope
// BUT, they still reference each other internally.
unset($a, $b);
}
echo "Final Memory (With Leak): " . number_format(memory_get_usage()) . " bytes\n";
// Force GC to clean up
$cleaned = gc_collect_cycles();
echo "GC Collected: $cleaned cycles\n";
echo "Memory After GC: " . number_format(memory_get_usage()) . " bytes\n";
What happened here? #
- Instantiation: We created $a and $b. Refcount for both is 1.
- Linking: $a->child points to $b. $b refcount is 2. $b->child points to $a. $a refcount is 2.
- Unsetting: When we unset($a, $b), the variables in the local scope are removed. The refcounts drop from 2 to 1.
- The Trap: Since refcount is 1 (not 0), PHP assumes they are still in use. But they are unreachable!
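This is where the Garbage Collector (GC) steps in. In fact, it may already have stepped in once while the loop above was running: the 20,000 unset values exceed the default root threshold of about 10,000, so the “leak” figure can understate how much memory the cycles would otherwise hold.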
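To make the trap visible without relying on memory figures, here is a minimal sketch that counts destructor calls instead. CycleNode and its static counter are hypothetical names chosen for illustration. With only two objects the collector does not run on its own, so nothing is destroyed until it is invoked by hand.

```php
<?php
// Hypothetical class: the destructor counter shows when objects are really freed.
final class CycleNode
{
    public static int $destroyed = 0;
    public ?CycleNode $child = null;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable();                       // make sure the collector is on (the default)

$a = new CycleNode();
$b = new CycleNode();
$a->child = $b;                    // $a -> $b
$b->child = $a;                    // $b -> $a: the cycle is closed

unset($a, $b);                     // no variable can reach the pair any more
$destroyedAfterUnset = CycleNode::$destroyed;

$collected = gc_collect_cycles();  // run the cycle collector by hand
$destroyedAfterGc = CycleNode::$destroyed;

printf("destroyed after unset(): %d\n", $destroyedAfterUnset);
printf("destroyed after GC run:  %d\n", $destroyedAfterGc);
```

The first counter stays at zero even though both objects are unreachable; only the explicit collection destroys them.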
Part 3: The Garbage Collector (GC) #
PHP’s GC (enabled by default) acts as a safety net for these circular references. It doesn’t run on every variable assignment; that would be too slow. Instead, it uses a Root Buffer.
How the GC Cycle Works #
- Suspects: When a refcount decreases but doesn’t hit 0, PHP marks this zval as a “possible garbage root” and adds it to the root buffer.
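- Buffer Limit: By default, the collection threshold is roughly 10,000 roots. Since PHP 7.3 the engine raises this threshold automatically when collections free very little.
- Trigger: When the number of buffered roots reaches the threshold, the GC mechanism kicks in.
- Mark and Sweep:
  - Simulation: The GC traverses the roots and simulates decreasing refcounts for all nested children by 1.
  - Check: If a variable’s refcount drops to 0 during simulation, it means it’s only referenced by itself (or the cycle). It is garbage. Values whose refcount stays above 0 are still reachable from outside, so their counts are restored.
  - Cleanup: PHP frees these variables and clears the buffer.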
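The root buffer can be inspected at runtime with gc_status(). The sketch below assumes PHP 7.3 or later (where the roots and threshold keys are available); the Link class is a hypothetical helper used only to create throwaway cycles.

```php
<?php
// Hypothetical class used only to build throwaway two-object cycles.
final class Link
{
    public ?Link $peer = null;
}

gc_enable();             // make sure the collector is on (the default)
gc_collect_cycles();     // start from a freshly collected root buffer

$rootsBefore = gc_status()['roots'];

for ($i = 0; $i < 100; $i++) {
    $a = new Link();
    $b = new Link();
    $a->peer = $b;
    $b->peer = $a;
    unset($a, $b);       // refcounts drop to 1, so both objects become "suspects"
}

$status = gc_status();
$rootsPending = $status['roots'] - $rootsBefore;

$collected = gc_collect_cycles();
$rootsAfter = gc_status()['roots'];

printf("current threshold:             %d\n", $status['threshold']);
printf("suspects buffered by the loop: %d\n", $rootsPending);
printf("values freed by the collector: %d\n", $collected);
printf("suspects left afterwards:      %d\n", $rootsAfter);
```

Because the loop stays far below the threshold, the suspects simply wait in the buffer until the manual collection at the end.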
Configuring the GC #
You can control the GC via php.ini or runtime functions:
| Function / Setting | Description | Use Case |
|---|---|---|
| zend.enable_gc | php.ini setting to enable/disable GC. | Keep enabled unless you have a specific low-latency, short-script requirement. |
| gc_enable() | Activates the circular reference collector. | Use if GC was disabled at runtime. |
| gc_disable() | Deactivates the collector. | Use during high-performance loops where you know no cycles exist to save CPU cycles. |
| gc_collect_cycles() | Forces the collection cycle immediately. | Call manually after a large batch job to free memory immediately. |
| gc_status() | Returns info about GC status. | Useful for debugging/monitoring. |
Part 4: WeakReferences - The Modern Solution #
Introduced in PHP 7.4, WeakReference allows you to hold a reference to an object that does not prevent the object from being destroyed. This is incredibly useful for caching mechanisms.
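If you build a cache array holding objects, that array usually keeps the objects alive forever. With WeakReferences, the cache no longer keeps the object alive: once the object is unset everywhere else in the app, it is destroyed and the cached WeakReference simply returns null from get(). The stale entry itself stays in the array until you remove it; WeakMap (PHP 8.0+) drops such entries automatically.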
<?php
$cache = [];
$obj = new stdClass();
$obj->id = 1;
// Traditional assignment: Strong Reference
// $cache['obj'] = $obj;
// Modern assignment: Weak Reference
$cache['weak'] = WeakReference::create($obj);
echo "Object exists? " . ($cache['weak']->get() ? 'Yes' : 'No') . "\n";
// Unset the main reference
unset($obj);
// Check the cache again
// $cache['weak']->get() will return NULL because the object was destroyed
echo "Object exists after unset? " . ($cache['weak']->get() ? 'Yes' : 'No') . "\n";
Why use this?
In long-running workers (like a WebSocket server), you might associate connection objects with user data. Using WeakReference ensures that if a connection closes and the main object is destroyed, your “lookup map” doesn’t leak memory by holding onto a zombie object.
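One possible shape for such a lookup map is sketched below. WeakLookup is a hypothetical helper, not a PHP built-in, and a stdClass stands in for a real connection object. The map prunes dead entries on access so that the array of WeakReference wrappers does not grow without bound either.

```php
<?php
// Hypothetical helper: a lookup map that never keeps its values alive.
final class WeakLookup
{
    /** @var array<string, WeakReference> */
    private array $refs = [];

    public function put(string $key, object $value): void
    {
        $this->refs[$key] = WeakReference::create($value);
    }

    public function get(string $key): ?object
    {
        $ref = $this->refs[$key] ?? null;
        if ($ref === null) {
            return null;
        }

        $value = $ref->get();
        if ($value === null) {
            unset($this->refs[$key]);   // prune the dead entry so the map itself stays small
        }

        return $value;
    }

    public function size(): int
    {
        return count($this->refs);
    }
}

$lookup = new WeakLookup();
$connection = new stdClass();           // stands in for a real connection object
$connection->userId = 42;

$lookup->put('conn-1', $connection);
$foundWhileOpen = $lookup->get('conn-1') === $connection;

unset($connection);                     // the "connection" closes
$foundAfterClose = $lookup->get('conn-1');
$sizeAfterClose = $lookup->size();

var_dump($foundWhileOpen, $foundAfterClose, $sizeAfterClose);
```

Pruning on read is only one option; a periodic sweep over the map would work as well and may suit workers that rarely look up closed connections.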
Part 5: Profiling Memory Leaks #
Assuming you can just read code and find leaks is a mistake. You need tools.
1. Xdebug Profiler #
Xdebug is the standard for PHP debugging. While mostly used for step-debugging, its profiling capabilities generate “Cachegrind” files that can visualize memory spikes.
Configure Xdebug in php.ini:
xdebug.mode=profile
xdebug.start_with_request=trigger
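xdebug.output_dir=/tmp/snapshots
With start_with_request=trigger, profiling only starts when the XDEBUG_TRIGGER variable is present (as an environment variable on the CLI, or as a GET/POST parameter or cookie over HTTP).
Analyze the output using tools like QCacheGrind (Windows/Linux) or KCacheGrind. Look for functions where “Memory Self” is high and doesn’t drop.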
2. php-memprof #
For a more specialized tool, php-memprof is excellent. It can generate Valgrind-compatible callgrind files specifically tracking memory allocation.
Usage Example:
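php -d extension=memprof.so script.php
It can dump the heap to a file, showing exactly which function allocated the memory that wasn’t freed. Depending on the memprof version, profiling may also need to be switched on explicitly (recent releases document a MEMPROF_PROFILE environment variable), so check the extension’s README for your build.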
3. Runtime Logging #
In production, you shouldn’t run Xdebug because of its overhead. Instead, sprinkle your long-running scripts with logging:
// In your main loop
$logger->info("Current Memory: " . memory_get_usage(true));
$logger->info("Peak Memory: " . memory_get_peak_usage(true));
Note: Passing true to memory_get_usage tells PHP to return the real memory allocated from the system (including overhead), not just the memory used by the script’s variables.
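Raw numbers are easier to act on when they are compared against a baseline. The following is a rough sketch, not a production monitor: memorySnapshot() is a hypothetical helper, the 512 KiB threshold is an arbitrary assumption, and the $retained array simulates data that accidentally outlives each job.

```php
<?php
/**
 * Hypothetical helper: one snapshot of the numbers worth logging.
 *
 * @return array{used:int, allocated:int, peak:int}
 */
function memorySnapshot(): array
{
    return [
        'used'      => memory_get_usage(),          // bytes in use by the script
        'allocated' => memory_get_usage(true),      // bytes reserved from the system
        'peak'      => memory_get_peak_usage(true), // highest reserved amount so far
    ];
}

$warnAfterBytes = 512 * 1024;   // assumed threshold; tune it for your worker
$baseline = memory_get_usage();
$retained = [];                 // simulates data that accidentally outlives each job
$warnings = 0;

for ($job = 1; $job <= 5; $job++) {
    $retained[] = str_repeat((string) $job, 256 * 1024);   // pretend each job leaks 256 KiB

    $snapshot = memorySnapshot();
    $growth = $snapshot['used'] - $baseline;

    if ($growth > $warnAfterBytes) {
        $warnings++;
        // In a real worker this line would go to your logger instead of stdout.
        printf("job %d: grew by %d bytes since start (peak %d)\n", $job, $growth, $snapshot['peak']);
    }
}
```

Growth that keeps climbing from job to job is the signal to look for; a one-off jump that later levels out is usually just a cache warming up.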
Part 6: Best Practices for 2025 and Beyond #
If you are building high-performance PHP applications, follow these golden rules:
1. Unset Large Variables Immediately #
Don’t wait for variables to go out of scope if you are done with a large dataset (e.g., a processed CSV file). unset($largeArray) helps keep the memory footprint low.
2. Use Generators for Large Datasets #
Instead of loading a million DB rows into an array, use Generators (yield). This processes one row at a time, keeping memory usage constant regardless of dataset size.
Bad:
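function getAllRows(PDO $db): array {
    // fetchAll() materializes every row as a PHP array at once
    return $db->query("SELECT * FROM huge_table")->fetchAll();
}
// Loads the whole result set (potentially 1GB) into RAM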
Good:
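function getRowsGenerator(PDO $db): Generator {
    $stmt = $db->query("SELECT * FROM huge_table");
    // fetch() returns false once no rows are left
    while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
        yield $row;
    }
}
// Holds one row at a time in PHP variables (a few KB). Some drivers still
// buffer the result set client-side; for MySQL, consider turning off
// PDO::MYSQL_ATTR_USE_BUFFERED_QUERY.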
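The difference can be measured without a database. In this sketch, rowsAsArray() and rowsAsGenerator() are hypothetical stand-ins for the two query styles, and the row shape is invented for illustration.

```php
<?php
// Hypothetical stand-ins for a database result set, so the sketch runs without a database.

function rowsAsArray(int $count): array
{
    $rows = [];
    for ($id = 1; $id <= $count; $id++) {
        $rows[] = ['id' => $id, 'name' => "row $id"];
    }
    return $rows;                                   // every row exists at the same time
}

function rowsAsGenerator(int $count): Generator
{
    for ($id = 1; $id <= $count; $id++) {
        yield ['id' => $id, 'name' => "row $id"];   // one row at a time
    }
}

$count = 20000;

// Cost of holding the complete array.
$baseline = memory_get_usage();
$rows = rowsAsArray($count);
$arrayCost = memory_get_usage() - $baseline;
unset($rows);

// Highest extra memory seen at any point while streaming the same rows.
$baseline = memory_get_usage();
$generatorCost = 0;
$seen = 0;
foreach (rowsAsGenerator($count) as $row) {
    $seen++;
    $generatorCost = max($generatorCost, memory_get_usage() - $baseline);
}
unset($row);

printf("array:     %d bytes for %d rows\n", $arrayCost, $count);
printf("generator: %d bytes at most for %d rows\n", $generatorCost, $seen);
```

The array cost grows with the row count, while the generator figure should stay roughly flat if $count is increased.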
3. Disable GC During Heavy Processing #
If you are iterating millions of times and creating many temporary objects that you know do not have circular references, the GC mechanism checking the roots slows you down.
gc_disable();
// ... intense processing loop ...
gc_enable();
gc_collect_cycles();
In tight loops this can yield a performance boost on the order of 5-10%, though the gain depends heavily on the workload, so measure it before and after.
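If the processing code can throw, a bare gc_disable() / gc_enable() pair may leave the collector switched off. One way to guard against that is a small wrapper; withoutGc() below is a hypothetical helper, shown as a sketch rather than a recommended API.

```php
<?php
/**
 * Hypothetical helper: runs $work with the cycle collector paused and
 * restores the previous state even if $work throws.
 */
function withoutGc(callable $work): mixed
{
    $wasEnabled = gc_enabled();
    gc_disable();
    try {
        return $work();
    } finally {
        if ($wasEnabled) {
            gc_enable();
        }
        gc_collect_cycles();   // sweep up any cycles created while paused
    }
}

gc_enable();                   // the default state, set explicitly for the demo

$total = withoutGc(function (): int {
    $sum = 0;
    for ($i = 0; $i < 1000; $i++) {
        $point = [$i, $i + 1];             // short-lived value with no cycles
        $sum += $point[0] + $point[1];
    }
    return $sum;
});

$enabledInside = withoutGc(fn (): bool => gc_enabled());
$enabledAfterwards = gc_enabled();

var_dump($total, $enabledInside, $enabledAfterwards);
```

Remembering the previous state matters when the helper is nested or called from code that had already disabled the collector on purpose.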
4. Beware of Closures #
Closures (anonymous functions) capture variables from the parent scope, and a closure created inside a method also binds $this automatically. This creates an implicit circular reference if the closure is stored inside the object it references.
The Fix: Declare closures static when they don’t need $this, use $this carefully inside the ones that do, or use WeakMap (PHP 8.0+) to associate data with objects without causing leaks.
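The sketch below illustrates both ideas. Listener, its method names, and the sample metadata are hypothetical. The weak variant combines a static closure with a WeakReference, which is one possible way to avoid the cycle, and the final lines show WeakMap dropping an entry on its own.

```php
<?php
// Hypothetical class: the destructor counter shows when instances are really freed.
final class Listener
{
    public static int $destroyed = 0;
    private ?Closure $callback = null;

    public function listenBound(): void
    {
        // A non-static closure created in a method binds $this:
        // object -> closure -> object is a cycle.
        $this->callback = function (): string {
            return get_class($this);
        };
    }

    public function listenWeak(): void
    {
        $self = WeakReference::create($this);
        // A static closure does not bind $this; the weak reference gives it
        // access to the object without keeping it alive.
        $this->callback = static function () use ($self): ?string {
            $listener = $self->get();
            return $listener === null ? null : get_class($listener);
        };
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable();                              // make sure the collector is on (the default)

$leaky = new Listener();
$leaky->listenBound();
unset($leaky);                            // still alive: the closure holds $this
$afterBound = Listener::$destroyed;

$clean = new Listener();
$clean->listenWeak();
unset($clean);                            // freed immediately: no cycle
$afterWeak = Listener::$destroyed;

gc_collect_cycles();                      // only now is the first instance collected
$afterGc = Listener::$destroyed;

// WeakMap: metadata keyed by an object disappears together with the object.
$metadata = new WeakMap();
$session = new stdClass();
$metadata[$session] = ['ip' => '203.0.113.7'];
$entriesWhileAlive = count($metadata);
unset($session);
$entriesAfterUnset = count($metadata);

var_dump($afterBound, $afterWeak, $afterGc, $entriesWhileAlive, $entriesAfterUnset);
```

The bound closure is not a bug in itself, since the cycle collector eventually reclaims it; the weak variant simply avoids depending on the collector in a long-running process.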
Conclusion #
Memory management in PHP is a blend of automatic convenience and manual responsibility. While the engine handles 95% of the work through Reference Counting and Copy-on-Write, that remaining 5%—circular references and large dataset handling—is what separates a junior developer from a senior engineer.
As we move into 2026, with PHP increasingly used for daemons, WebSocket servers, and microservices, these skills are non-negotiable.
Key Takeaways:
- Trust Refcounting, but understand its limits.
- Circular References are the enemy; use the GC to fight them.
- WeakReferences are your friend for caching.
- Generators are superior to Arrays for iterating over large datasets.
- Always profile before you optimize.
Start auditing your long-running scripts today. A few unset() calls and a WeakReference implementation might just save your server from its next OOM (Out of Memory) crash.