Analyzing spider crawl times from the S log to get a site indexed promptly and protect ori

The Baidu spider is simply Baidu's crawler. It automatically visits web pages and fetches their content, on the same principle as what we call a news thief (a content scraper), except that this is a thief we welcome. A spider does not stay on a site all the time. On a large site there may be many spiders visiting many different pages, so that some spider is active on the site every second. Even on such a site, though, for one specific page (such as the home page) there is generally an interval between spider visits, ranging from a few seconds to a few hours or even a few days. This is the spider's crawl interval.

The Baidu spider crawl interval

Owners of original content often complain that high-weight sites scrape their hard work, so that their original articles end up credited to other sites. Today I will share, through examples, how to solve this problem.

A solution

Consider how intellectual property works in real life. In the simplest case, A publishes an original article in a magazine; B sees it, plagiarizes it without changes, and submits it to another magazine. A sues B, and the court can easily rule from the publication dates that B copied from A, since A published first. (If B modified or reworked the article before publishing it, the outcome depends on the court's assessment and the evidence from both sides.) Now back to the online world, and in particular to the rules Baidu uses to decide who is the original. Suppose Baidu has determined that the same article appears on two different sites. Deciding who is the original is very simple: it is whoever Baidu indexed first, not whoever published first. A site owner may say: my article was published first, but Baidu only indexed it n hours later, while another site scraped it and was indexed by Baidu immediately, before mine, so I became the non-original. The problem here is the indexing time.

Crawl regularity and indexing time

That being so, let us talk about crawl regularity. A spider visits a specific site (or page) on a fixed cycle, for example every few minutes or every few hours. Take testjar (logs analyzed with Web Explorer, with the log data exported to Excel and classified).

Above are the author's statistics on the spider crawl pattern for the site. (I originally wanted to list the hourly data for a full 2 days, but there was too much data to publish conveniently, so only a screenshot is given.)
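The same kind of statistics can also be pulled from a raw access log with a short script instead of a spreadsheet. The following is only a minimal sketch, not the author's method: it assumes the server writes the common "combined" log format (adjust the pattern for other formats) and that the spider identifies itself with the substring `Baiduspider` in its user agent. The function names and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: "combined" log format. Change this pattern for other servers.
LOG_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def spider_visits(lines, spider="Baiduspider"):
    """Return {path: sorted visit datetimes} for requests made by one spider."""
    visits = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and spider in match.group("agent"):
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            visits[match.group("path")].append(when)
    return {path: sorted(times) for path, times in visits.items()}

def crawl_intervals(times):
    """Seconds between consecutive visits to the same page."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]

# Hypothetical sample lines, invented for this example.
BAIDU = "Mozilla/5.0 (compatible; Baiduspider/2.0)"
sample = [
    f'1.2.3.4 - - [10/Oct/2023:08:00:00 +0800] "GET / HTTP/1.1" 200 512 "-" "{BAIDU}"',
    f'1.2.3.4 - - [10/Oct/2023:08:30:00 +0800] "GET / HTTP/1.1" 200 512 "-" "{BAIDU}"',
    '5.6.7.8 - - [10/Oct/2023:08:45:00 +0800] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    f'1.2.3.4 - - [10/Oct/2023:09:30:00 +0800] "GET / HTTP/1.1" 200 512 "-" "{BAIDU}"',
]

home_visits = spider_visits(sample)["/"]
print(crawl_intervals(home_visits))  # [1800.0, 3600.0]
```

Running this over a few days of logs for the home page or a list page gives the crawl interval for that page directly.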

Baidu is slow to index our content, so how do we solve this and get Baidu to index it as soon as possible? Generally there are 2 methods. The first is to use the PING service: immediately after you publish an article, PING tells Baidu its address (for an introduction to the PING service and how to use it, please refer to the Baidu Webmaster Platform, or contact the author). PING generally works for authoritative news-source sites; for small sites, Baidu seems to ignore it. The second method is the focus of this article: choosing an appropriate release time.
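One way to read "an appropriate release time" is to publish shortly before the spider is next expected to visit. The sketch below illustrates that reading under stated assumptions: it estimates the next visit as the last observed visit plus the median of the past intervals, and the function name `suggest_publish_time` and the 5-minute lead are hypothetical choices, not something prescribed by Baidu.

```python
from datetime import datetime, timedelta
from statistics import median

def suggest_publish_time(visits, lead=timedelta(minutes=5)):
    """Suggest a publish time shortly before the next expected spider visit.

    visits: chronologically sorted datetimes of past spider visits to one page.
    lead:   how long before the expected visit to publish (an assumption).
    """
    if len(visits) < 2:
        raise ValueError("need at least two visits to estimate an interval")
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(visits, visits[1:])]
    expected_next = visits[-1] + timedelta(seconds=median(gaps))
    return expected_next - lead

# Hypothetical visit times, invented for this example.
visits = [
    datetime(2023, 10, 10, 8, 0),
    datetime(2023, 10, 10, 10, 0),
    datetime(2023, 10, 10, 12, 0),
]
print(suggest_publish_time(visits))  # 2023-10-10 13:55:00
```

The median is used rather than the mean so that one unusually long gap does not distort the estimate. If the intervals vary widely, the prediction is of little use and the logs should be examined over a longer period.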
