Is it Worth It Anymore for a Beginner

Guard Dog

Guard Dog
Staff member
Joined
Dec 13, 2006
Messages
11,353
Reaction score
3,179
Yes. I know they do. It is easy to code that, to be honest. I use .NET for my Terms Spider, the one that detects terms changes on a daily basis. I sometimes identify as Googlebot myself, as I have been purposefully blocked by programs that do not want their terms changes seen. My guess is that changing the user agent is VERY common. And it is one line of code in .NET. Easy.
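For anyone curious what that looks like, here is a rough sketch in Python rather than .NET. It is not the actual Terms Spider; the function names and the example URL are made up for illustration, and the hash comparison is just one assumed way to notice a terms change between daily runs.

```python
import hashlib
import urllib.request

# The publicly documented Googlebot user agent string, used here only as an example value.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Return a request with the User-Agent header overridden (the 'one line')."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fingerprint(page_text: str) -> str:
    """Hash the text with whitespace collapsed, so reflowed text is not a change."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def terms_changed(previous_text: str, current_text: str) -> bool:
    """Compare yesterday's stored text with today's fetch (hypothetical helper)."""
    return fingerprint(previous_text) != fingerprint(current_text)
```

Passing the result of `build_request("http://example.com/terms", GOOGLEBOT_UA)` to `urllib.request.urlopen` would fetch the page with that user agent. This is exactly why a user agent string on its own proves nothing about who is really visiting.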
 

falseadoom

Affiliate Guard Dog Member
Joined
May 8, 2014
Messages
359
Reaction score
106
People think they are not scraped, but most are.
They do not just scrape the first sentence anymore. They go down the page and usually take 5 to 6 word combinations now. With some articles they will end up scraping 90% of it a few words at a time and posting it in many different places.
So if you outrank the copied content you think all is good, but you're just one of tons with the same content, and you have less value.
And because these scraper sites aren't ranking, you think there's no effect. Well, it is not all about the scraper sites. They have normal sites too that they want to rank on the first page, so making your sites go down in the SERPs makes their main sites go up.

They also build bad links to your sites to try and get them nailed or demoted. Seen a lot of that.
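
On the 5 to 6 word combinations: one rough way to estimate how much of an article has been lifted is to compare overlapping word runs ("shingles") between your page and a suspect page. This is only a sketch; the function names and the default run length of 5 are assumptions, not how any particular scraper or search engine works.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping `size`-word sequences in `text`, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's shingles that also appear in the suspect page."""
    source = shingles(original, size)
    if not source:
        # Text shorter than one shingle: nothing to compare.
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)
```

A result near 0.9 would match the "90% of it a few words at a time" case, even when no whole sentence was copied.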
 

AussieDave

24 years & still going!
Joined
Nov 28, 2013
Messages
5,103
Reaction score
3,607
they do use fake googlebot referrers too..

Yes they do. However at my sites, they don't get in, just like all the other crap that's blocked at the front gate ;)

I used to think my sites got a fair bit of traffic. That was before I blocked all the crap hitting them on a daily basis. Not only did I save a heap of bandwidth, but now those unique hits and page views are 100% legit!

Below are more examples of the crap affiliates allow into their sites each day if they're not blocking this garbage. I've also started blocking browser versions. E.g. if someone is using, say, IE6 or FF 12, why? These are way outdated and commonly used by undesirables.
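
A browser-version block can be sketched roughly like this. The minimum versions and the function name are hypothetical, not the rules my blocker actually uses, so tune them to your own traffic.

```python
import re

# Hypothetical minimum major versions; anything older is treated as suspect.
MIN_VERSIONS = {"MSIE": 9, "Firefox": 30}

# Matches "MSIE 6.0" or "Firefox/13.0.1" and captures the major version number.
_UA_PATTERN = re.compile(r"(MSIE|Firefox)[ /](\d+)")


def is_outdated_browser(user_agent: str) -> bool:
    """Return True when the UA claims an IE or Firefox version below the minimum."""
    match = _UA_PATTERN.search(user_agent)
    if match is None:
        # Not IE or Firefox (or no version found): leave it to other rules.
        return False
    name, major = match.group(1), int(match.group(2))
    return major < MIN_VERSIONS[name]
```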

It's about keeping your site safe, allowing legit bots and punters access.

Code:
Host: fulltextrobot-77-75-77-17.seznam.cz
IP: 77.75.77.17
Score: 3
Violation count: 1
Why blocked: Dubious spider UA (SPD-0005). Bothost and/or Server Farm. Not an access provider ISP (HN-0152). Cloaked Dubious spider UA (HN-0153).
Query:
Referer:
User Agent: Mozilla/5.0 (compatible; SeznamBot/3.2; +hxxp://fulltext.sblog.cz/)

Host: mj8.naefmarco.ch
IP: 62.210.143.245
Score: 1
Violation count: 3 BANNED
Why blocked: Use of MJ12 bot not authorized by website admin (UA-0025).
Query:
Referer:
User Agent: Mozilla/5.0 (compatible; MJ12bot/v1.4.5; hxxp://www.majestic12.co.uk/bot.php?+)


Host: ec2-54-72-213-235.eu-west-1.compute.amazonaws.com
IP: 54.72.213.235
Score: 7
Violation count: 1 INSTA-BANNED
Why blocked: Amazon Web Services. Not an access provider ISP. Used by hackers, Keyword spamming SEO bots, and other unsavories (CLD-0AMZ). Checked for bypass - POST unescaped < (POST-008). POST unescaped > (POST-009). POST unescaped " (POST-012). POST RFI (POST-045). POST RFI (POST-047). POST RFI (POST-048). Heavy hit. INSTA-BAN. You have been instantly banned due to extremely hazardous behavior!
Query:
Referer:
User Agent:


Host: caprica.servers.rbl-mer.misp.co.uk
IP: 195.62.29.55
Score: 1
Violation count: 1
Why blocked: Bad UA. caprica.servers.rbl-mer.misp.co.uk is not a valid googlebot domain. See: hxxp://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html  (UA-139.0).
Query:
Referer:
User Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +hxxp://www.google.com/bot.html)


Host: security.lightspeedsystems.com
IP: 69.84.207.246
Score: 1
Violation count: 3 BANNED
Why blocked: LightSpeed systems censor bot, scrapes sites proactively instead of on use (HN-0134).
Query:
Referer:
User Agent: Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US


Host: 139.seven.greendata.pl
IP: 91.228.196.139
Score: 1
Violation count: 1
Why blocked: No registrations, or logins, from hosts listed as hostile on Stop Forum Spam (hxxp://www.stopforumspam.com/removal) (remote).
Query:
Referer:
User Agent: Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/13.0.1
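
The fake Googlebot entry above (195.62.29.55) is caught by the check Google describes in the linked post: reverse-resolve the IP, confirm the hostname is under a Google domain, then forward-resolve that hostname and make sure it maps back to the same IP. A minimal sketch, assuming IPv4 only; the function names are made up and this is not the blocker's actual code.

```python
import socket

# Domains Google documents for genuine Googlebot reverse-DNS hostnames.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname: str) -> bool:
    """Pure string check: does the reverse-DNS name end in a Google domain?"""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def verify_googlebot(ip: str) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
        if not is_google_hostname(hostname):
            return False
        # Forward lookup must include the original IP, or the PTR record was spoofed.
        forward_ips = socket.gethostbyname_ex(hostname)[2]
    except OSError:
        # Covers failed reverse or forward lookups.
        return False
    return ip in forward_ips
```

The forward lookup matters because anyone who controls their own reverse DNS can set a PTR record that merely looks like a Google hostname.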
 