Browser
Guide for browser instance management
In SERPS, the browser is an element that helps mimic the behavior of a real browser.
Be aware that the name "browser" reflects the fact that this entity can emit HTTP requests as a real browser would, taking care of cookies, proxies, language, etc. It is not, however, able to render an HTML page or to evaluate CSS or JavaScript.
Browsers are available since version 0.2 of serps/core.
Create a browser
A minimal browser only requires an HTTP client to send requests through:
use Serps\Core\Browser\Browser;
use Serps\HttpClient\CurlClient;
$browser = new Browser(new CurlClient());
Setting the user agent
The next step in configuring the browser is to specify what user agent we want to use:
use Serps\Core\Browser\Browser;
use Serps\HttpClient\CurlClient;
$userAgent = 'some user agent';
$browser = new Browser(new CurlClient(), $userAgent);
// Or set it later:
$browser->setUserAgent('other UA');
When setting a user agent, prefer a real user agent string, such as one taken from a published user agent list.
Setting the language
The browser is responsible for managing headers. The Accept-Language header is one of the most important, so you should specify it when creating a new browser:
use Serps\Core\Browser\Browser;
use Serps\HttpClient\CurlClient;
$language = 'fr-FR';
$browser = new Browser(new CurlClient(), $userAgent, $language);
// Or set it later:
$browser->setAcceptLanguage('fr-CA');
If you don't specify a language, or if you set it to null, the default value en-US,en;q=0.8 will be used.
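To illustrate how a value such as en-US,en;q=0.8 is read by a server, the sketch below splits an Accept-Language string into language tags ordered by their q-weight. The helper parseAcceptLanguage is a hypothetical name introduced for this illustration. It is not part of SERPS, and it only handles the simple tag;q=weight form.

```php
<?php
// Illustrative helper (an assumption, not part of SERPS): turn an
// Accept-Language value into a "tag => weight" map, highest weight first.
function parseAcceptLanguage(string $header): array
{
    $weights = [];
    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', trim($part));
        $tag = trim($pieces[0]);
        if ($tag === '') {
            continue; // skip empty entries such as a trailing comma
        }
        $q = 1.0; // a tag without an explicit q parameter weighs 1.0
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $weights[$tag] = $q;
    }
    // Sort by weight, descending, keeping the tag => weight association.
    arsort($weights);
    return $weights;
}

$preferences = parseAcceptLanguage('en-US,en;q=0.8');
```

With the default value, en-US carries no explicit weight and therefore counts as 1.0, while en is accepted with the lower preference 0.8.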
Using a cookie jar
Warning
Cookie usage is still at the prototype stage, and not all HTTP engines support cookies yet.
Like a real browser, the browser class can use a cookie jar:
use Serps\Core\Browser\Browser;
use Serps\HttpClient\CurlClient;
$cookieJar = ....;
$browser = new Browser(new CurlClient(), $userAgent, $language, $cookieJar);
// Or set it later:
$browser->setCookieJar($otherCookieJar);
// You can also remove it to disable cookies
$browser->setCookieJar(null);
By default no cookie jar will be used and requests will be free of cookies.
To learn more about how to create a cookie jar, please check the cookie documentation.
Using a proxy
It's possible to give the browser a proxy. The browser will use this proxy for every request.
use Serps\Core\Browser\Browser;
use Serps\HttpClient\CurlClient;
$proxy = ....;
$browser = new Browser(new CurlClient(), $userAgent, $language, $cookieJar, $proxy);
// Or set it later:
$browser->setProxy($otherProxy);
// You can also remove it to disable proxy
$browser->setProxy(null);
By default no proxy is used.
To learn more about how to create proxies, please check the proxy documentation.
Setting default headers
The browser instance can add default headers to every request it sends.
That can be helpful, for instance, when you want to set a custom referrer on the HTTP requests you send to a server.
$browser->setDefaultHeader('Referer', 'my custom referrer');
You are also able to check if a default header is configured for the browser or to get the value for the header:
$browser->hasDefaultHeader('Referer'); // = true
$browser->getDefaultHeaderValue('Referer'); // = "my custom referrer"
$browser->hasDefaultHeader('foo'); // = false
$browser->getDefaultHeaderValue('foo'); // = null
// Note that hasDefaultHeader and getDefaultHeaderValue are case-insensitive
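HTTP header names are case-insensitive, which is why lookups behave as described above. As a rough sketch of one way such a lookup can be built, the class below normalises header names to lower case before storing or reading them. DefaultHeaderBag and its methods are hypothetical names used only for illustration. This is not the SERPS implementation.

```php
<?php
// Hypothetical sketch (not the SERPS implementation): a header store whose
// lookups ignore case by normalising every name to lower case.
final class DefaultHeaderBag
{
    /** @var array<string, string> lower-cased header name => value */
    private $headers = [];

    public function set(string $name, string $value): void
    {
        $this->headers[strtolower($name)] = $value;
    }

    public function has(string $name): bool
    {
        return isset($this->headers[strtolower($name)]);
    }

    public function get(string $name): ?string
    {
        // Unknown headers yield null, mirroring the behavior shown above.
        return $this->headers[strtolower($name)] ?? null;
    }
}

$bag = new DefaultHeaderBag();
$bag->set('Referer', 'my custom referrer');
$viaLowerCase = $bag->get('referer'); // same entry, different casing
$missing = $bag->get('foo');          // never set
```

Normalising on write as well as on read keeps a single entry per header, so setting "Referer" and later "referer" overwrites the value rather than creating a duplicate.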
Issue HTTP requests
You can get the response for a given URL:
use Serps\Core\Browser\Browser;
use Serps\HttpClient\CurlClient;
use Serps\Core\Url;
$browser = new Browser(new CurlClient());
$response = $browser->navigateToUrl(Url::fromString('https://example.com'));
$statusCode = $response->getHttpResponseStatus();
$pageContent = $response->getPageContent();
Or use PSR-7 requests:
$response = $browser->sendRequest($myPsr7Request);
$statusCode = $response->getHttpResponseStatus();
$pageContent = $response->getPageContent();