Saturday, May 9, 2015

Getting the source of websites using file_get_contents

I have a list of a couple of thousand websites. I have to iterate over them and, in each iteration, call file_get_contents on the given URL, search the source for some information using a regex, and write the result to another file.
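To make the setup concrete, here is a minimal sketch of the loop described above. The file names (urls.txt, results.txt), the <title> pattern, and the 5-second per-request timeout are placeholders assumed for illustration, not the real values.

```php
<?php
// Sketch of the sequential approach; all names and the pattern are hypothetical.

/** Return the first capture group of $pattern in $html, or null if nothing matches. */
function extractInfo(string $html, string $pattern): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? trim($m[1]) : null;
}

/** Fetch each URL in turn and append "url<TAB>match" lines to $outFile. */
function scrapeSequentially(array $urls, string $pattern, string $outFile): void
{
    // Assumed 5-second cap per request so one slow host cannot stall the run.
    $context = stream_context_create(['http' => ['timeout' => 5]]);

    foreach ($urls as $url) {
        $html = @file_get_contents($url, false, $context);
        if ($html === false) {
            continue; // skip sites that could not be read
        }
        $info = extractInfo($html, $pattern);
        if ($info !== null) {
            file_put_contents($outFile, $url . "\t" . $info . PHP_EOL, FILE_APPEND);
        }
    }
}

// Usage (assumed file names and pattern):
// $urls = file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// scrapeSequentially($urls, '/<title>(.*?)<\/title>/is', 'results.txt');
```

Each request here blocks until the previous one has finished, so the total time is roughly the sum of all response times.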

OK, the thing is, it's very, very slow. I split the whole process so that about 50 URLs are handled each time I refresh the page. But:

  • I'd have to keep refreshing the page until I get through all of the couple of thousand URLs
  • even with only 50 URLs, I exceed the 30-second maximum execution time

Is there a way to speed this up?
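
One direction that might help, sketched under assumptions and not tested against this list: fetch a batch of URLs concurrently with PHP's curl_multi functions instead of one at a time. The function name fetchBatch, the timeouts, and the batch size in the usage comment are hypothetical, and the cURL extension must be enabled.

```php
<?php
// Sketch only: requires the cURL extension; names and limits are assumptions.

/**
 * Fetch a batch of URLs concurrently.
 * Returns an array mapping each URL to its body, or to null when the
 * response was not a 2xx HTTP status.
 */
function fetchBatch(array $urls, int $timeoutSeconds = 10): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach (array_unique($urls) as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,  // keep the body instead of printing it
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CONNECTTIMEOUT => 5,
            CURLOPT_TIMEOUT        => $timeoutSeconds,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running > 0 && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $url => $ch) {
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $bodies[$url] = ($code >= 200 && $code < 300) ? curl_multi_getcontent($ch) : null;
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $bodies;
}

// Usage (assumed batch size of 20):
// foreach (array_chunk($urls, 20) as $batch) {
//     foreach (fetchBatch($batch) as $url => $html) { /* run the regex, write the result */ }
// }
```

With this shape, a batch takes about as long as its slowest site, not the sum of all of them. Running the script from the command line (php script.php) also avoids the refresh cycle, since max_execution_time defaults to 0 (no limit) in the CLI; in a web request, set_time_limit(0) is the usual way to lift the limit, where the host allows it.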
