Tuesday, July 1, 2025


This proxy provider I tested is the best for web scraping – and it’s not IPRoyal or MarsProxies

Oxylabs provides a variety of web scraping and related services. These include operating proxy machines, offering developer APIs for accessing and making requests through those machines, and offering supporting services (including a scraping-aware AI) for parsing retrieved data so it can be used in applications.

Enormous proxy pools

Compared to other proxy services like IPRoyal or MarsProxies, Oxylabs offers a much larger pool of residential proxy machines. MarsProxies reports just one million machines in its proxy pool, IPRoyal reports a pool of 32 million residential machines, while Oxylabs offers more than 175 million residential proxies across 195 countries.

When it comes to uninterrupted scraping operations, the more machines available, the less likely any one machine is to be flagged as intrusive by website operators. This both reduces the load on the sites and increases the likelihood that scraping operations will succeed.
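To make the pool-size argument concrete, here is a minimal sketch of round-robin proxy rotation. The proxy addresses are hypothetical placeholders (documentation-reserved IPs), not Oxylabs endpoints, and the helper names are mine. The point is only that the same batch of requests puts less traffic on each machine as the pool grows.

```python
from itertools import cycle

# Hypothetical proxy endpoints (documentation-reserved addresses), not real Oxylabs hosts.
PROXY_POOL = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


def assign_proxies(urls, pool):
    """Pair each URL with the next proxy in round-robin order."""
    rotation = cycle(pool)
    return [(url, next(rotation)) for url in urls]


def max_requests_per_proxy(assignments):
    """Return the largest number of requests routed through any single proxy."""
    counts = {}
    for _, proxy in assignments:
        counts[proxy] = counts.get(proxy, 0) + 1
    return max(counts.values())


urls = [f"https://example.com/page/{n}" for n in range(6)]
assignments = assign_proxies(urls, PROXY_POOL)
```

With six requests, a single proxy would carry all six, while the three-proxy pool above carries two each. Scale that up to millions of machines and any one address sends very few requests to a given site.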

One thing stuck in my mind while reviewing this information: How, exactly, does a company like Oxylabs gain access to 175 million machines, especially since they say they do so ethically? Oxylabs provides a must-read report that discusses their procurement processes and policies.

It turns out that the company pays residential machine owners a small amount in return for the use of a slice of their bandwidth. This is all facilitated by a number of different apps that offer users a financial reward for participating in these programs.

I've run into the promotion of these apps before, but I didn't realize their raison d'être: to provide access to distributed machines for data acquisition networks. No individual computer user is going to get rich off of these participation programs, but if you're someone who uses bandwidth sparingly, it can be a way to pick up a few extra bucks.


In addition to residential proxies, the company offers ISP proxies (which use residential IPs but are hosted in an ISP's data center for more stability), mobile proxies (which run on and report to sites as mobile devices, for mobile-specific testing), data center proxies (for rock-solid performance, but less anonymity), and dedicated data center proxies (which give you unlimited bandwidth and a dedicated IP for high-performance work).

Testing out the coding interface

The folks at Oxylabs gave me access to their coding interface, so I was able to get a feel for what it takes to use their proxies, make data requests, and parse the data for application use.

The company gets kudos for how it provides usage information. They have a very helpful YouTube channel with 425 videos. I only had time to watch a fraction of them, but they're clear, to the point, and very informative.

The company has an easy-to-understand dashboard, which is the starting point for all operations.

They also offer a testing platform, called the API playground. It's here that you can paste in code segments and see how they perform. Note that the company offers pre-written code blocks for cURL, Python, PHP, C#, Go, Java, Node.js, and JSON. That's a plus, because many API vendors don't do this. I always feel more comfortable when I can see code examples in the programming environment I'm using.

Things got really interesting when I started to tinker with the Oxy AI, called OxyCopilot. First, I'm recommending that Oxy change the AI's name, since Copilot is Microsoft's term and there's likely to be pushback from Redmond's trademark enforcement team.


That said, OxyCopilot is cool. One of the harder parts of web scraping operations is that once you get the data back, you have to figure out how to extract usable information. Since you're really getting back an entire HTML page (filled with ads, HTML tags, and a ton of unrelated information), that post-processing step is algorithmically non-trivial.
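To give a sense of what that post-processing involves when done by hand, here is a rough sketch using only Python's standard-library `html.parser`. The HTML fragment and the class names (`product-title`, `price`) are invented for illustration. A real product page is far larger, and its markup would have to be inspected first to find the right selectors.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real product pages are far larger and messier.
SAMPLE_HTML = """
<html><body>
  <div class="ad">Buy more stuff!</div>
  <h1 class="product-title">Example Widget</h1>
  <span class="price">$19.99</span>
  <div class="footer">Unrelated links</div>
</body></html>
"""


class ProductParser(HTMLParser):
    """Collect text from elements whose class attribute matches a wanted field."""

    # Maps a (hypothetical) CSS class to the output field it should populate.
    WANTED = {"product-title": "title", "price": "price"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # output field currently being captured, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        self._current = self.WANTED.get(css_class)

    def handle_data(self, data):
        # Keep the first non-blank text inside a wanted element.
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


parser = ProductParser()
parser.feed(SAMPLE_HTML)
print(parser.fields)
```

Even this toy version needs hand-picked selectors and state tracking, and it ignores nested markup, multiple prices, and layout changes. That is the tedium a parsing assistant is meant to remove.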

On the left is the scraping data that Oxylabs pulled back during a test scrape in their playground. On the right is the product I was scraping, my favorite tech product of all time. The one odd thing is that I gave OxyCopilot the URL to an English-language page and the preview it's showing is in Spanish, although the pricing information is the same.

Notice how complicated the raw returned data is. But then I did the same operation using OxyCopilot. I started by giving it a URL to scrape.

Then, I skipped past the scraper parameters to give the AI some parsing instructions. All I asked was, "Please extract current product name and price. Indicate if the price is a discounted price or the regular price."

The result is this interesting form. Note that it did pull the pricing data correctly. It presented the data to me as a JSON block. But the interesting bit is the Parsing Instructions tab on the far right.

What the AI has done is create a JSON structure that you feed into the Oxylabs API when sending a scraping request. The API will follow the instructions embedded in that JSON structure, and give you back just the data you asked for.
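The general idea of bundling parsing instructions with a scraping request can be sketched as follows. This is not the structure OxyCopilot generated, which isn't reproduced here. The key names (`url`, `parse`, `parsing_instructions`, `selector`) and the XPath expressions are assumptions for illustration, so check the Oxylabs documentation for the real schema before relying on any of them.

```python
import json


def build_scrape_request(url, parsing_instructions):
    """Bundle a target URL with parsing instructions into one request body.

    The key names used here are illustrative assumptions, not a confirmed API schema.
    """
    return {
        "url": url,
        "parse": True,
        "parsing_instructions": parsing_instructions,
    }


# Hypothetical instructions: one XPath-style selector per output field.
instructions = {
    "product_name": {"selector": "//h1/text()"},
    "price": {"selector": "//span[@class='price']/text()"},
}

body = json.dumps(
    build_scrape_request("https://example.com/product", instructions),
    indent=2,
)
print(body)
```

The appeal of this design is that the extraction logic travels with the request as data, so the service does the parsing and the client receives only the named fields instead of a whole HTML page.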


I've done web page parsing many times before, and it's a very time-consuming, tedious job. This took me less than five minutes.

ZDNET's buying advice

So, should you use this service? Keep in mind that decisions at this level are business and operational decisions. From the perspective of ethical sourcing, Oxylabs seems like a good choice. And, judging from my limited testing, it's also a good choice from a programming and algorithmic perspective.

As for whether it's cost-effective, that depends entirely on your use case. Only you and your team can decide that.

Finally, when it comes to documentation and training materials, Oxylabs is first-rate. I was very impressed with the overall content on their site and on YouTube. It brought me up to speed very quickly.

What about you? Have you used proxy or web scraping services like Oxylabs in your work or research? What challenges have you faced with data collection at scale, and how did you navigate ethical or technical roadblocks? Have you tried integrating AI tools like OxyCopilot to streamline your scraping workflows? Let us know in the comments below.


You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.
