📄️ Basic Request
To scrape a web page using scrape-it.cloud, simply call the API's base endpoint, passing the URL you would like to scrape as a body parameter and your API key as a header.
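A minimal Python sketch of the request shape described above. The endpoint URL and the x-api-key header name are assumptions for illustration; check your dashboard for the exact values.

```python
import json

# Assumed endpoint and header name -- verify against your account's API reference.
API_URL = "https://api.scrape-it.cloud/scrape"

headers = {
    "x-api-key": "YOUR_API_KEY",        # API key goes in a header
    "Content-Type": "application/json",
}
payload = {"url": "https://example.com"}  # target page goes in the body

body = json.dumps(payload)

# To actually send it (requires the `requests` package and a real key):
# import requests
# response = requests.post(API_URL, headers=headers, data=body)
```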
📄️ Response Format
The scrape-it.cloud web scraping API endpoint returns a JSON object with status and scrapingResult properties.
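A sketch of reading that response shape in Python. Only status and scrapingResult come from the description above; the content field inside scrapingResult is an assumed example, since the exact fields depend on your request options.

```python
import json

# Example response body; the "content" field name is an assumption.
sample = json.loads(
    '{"status": "ok", "scrapingResult": {"content": "<html>...</html>"}}'
)

html = None
if sample["status"] == "ok":
    html = sample["scrapingResult"]["content"]  # scraped page data
```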
📄️ Custom Headers
If you would like to use your own custom headers (user agents, cookies, etc.) when making a request to the website, simply set them in the headers parameter of the JSON body. The API will then use these headers when sending requests to the website.
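A short sketch of a body with forwarded headers, assuming the headers parameter described above; the header names and values are only examples.

```python
payload = {
    "url": "https://example.com",
    # These headers are forwarded to the target website.
    "headers": {
        "User-Agent": "Mozilla/5.0 (compatible; MyScraper/1.0)",
        "Accept-Language": "en-US",
    },
}
```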
📄️ Custom Cookies
User cookies can be passed to the target web pages. scrape-it.cloud handles custom cookie names and values. To set multiple cookies, separate them with ;.
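A sketch of passing two cookies in one request body. The cookies parameter name is an assumption; the ;-separated name=value format follows the description above.

```python
payload = {
    "url": "https://example.com",
    # "cookies" parameter name is an assumption; multiple cookies
    # are separated with ";" as described above.
    "cookies": "session_id=abc123; theme=dark",
}
```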
📄️ Screenshots
To get a screenshot of the scraped page, use the "screenshot": "true" parameter.
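A minimal body requesting a screenshot, using the parameter exactly as described above:

```python
payload = {
    "url": "https://example.com",
    "screenshot": "true",  # string value "true", per the docs above
}
```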
📄️ Extraction Rules
scrape-it.cloud also provides the ability to use extraction rules, which allow you to get only the target data in JSON format without the need for HTML parsing on your side.
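A sketch of what an extraction-rules body might look like. Both the extract_rules parameter name and the output-key-to-CSS-selector rule shape are assumptions; consult the Extraction Rules page for the real syntax.

```python
payload = {
    "url": "https://example.com",
    # Assumed rule shape: map output keys to CSS selectors.
    "extract_rules": {
        "title": "h1",
        "first_paragraph": "p",
    },
}
```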
📄️ Wait for CSS selector
Generally, our system waits until the entire page loads from the networking perspective, meaning the result is returned as soon as network activity finishes. To wait for a specific element instead, provide a CSS selector to wait for.
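A sketch of waiting for a selector before the HTML is returned. The wait_for parameter name is an assumption, used here only to illustrate the idea; check this page for the actual name.

```python
payload = {
    "url": "https://example.com",
    # Assumed parameter: return only after this selector appears.
    "wait_for": "#content",
}
```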
📄️ Wait For A Fixed Amount Of Time
Sites with a large amount of source code may need extra time to be fully rendered. To have Scrape-it.cloud wait before returning the HTML, use the wait option. Values can range from 0 to 35000 milliseconds.
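A minimal body using the wait option with a value inside the documented 0-35000 ms range:

```python
WAIT_MS = 5000  # must be between 0 and 35000 milliseconds

payload = {
    "url": "https://example.com",
    "wait": WAIT_MS,  # delay before the HTML is returned
}
```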
📄️ Blocking Images And CSS
By default, scrape-it.cloud does not block images and CSS on the scraped page. To speed up requests by blocking images and CSS, set the block_resources parameter to true.
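A minimal body enabling resource blocking with the parameter named above:

```python
payload = {
    "url": "https://example.com",
    "block_resources": True,  # skip images and CSS to speed up the request
}
```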
📄️ Blocking URLs
If you want to block resources other than images and CSS, for example analytics scripts, you can pass the parts of the URLs that should be blocked.
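A sketch of blocking requests by URL fragment. The block_urls parameter name is an assumption; the idea of matching on parts of URLs follows the description above.

```python
payload = {
    "url": "https://example.com",
    # Assumed parameter: any request whose URL contains one of
    # these fragments is blocked.
    "block_urls": ["google-analytics.com", "doubleclick.net"],
}
```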