Special: index.php?topic, Powered by SMF — automatic forum engine detection
The contest is organized by Botmaster Labs. I hadn't planned to take part: there is no time; the contest requires a video (the newfangled trend), although in my humble opinion everything is easier to explain with good screenshots, and I don't particularly feel like filming anything; and there are very few profitable topics left. Dumb spam no longer pays at all — here you have to think — and nobody is going to give a working topic away, unless they take an obsolete one, put it in a pretty wrapper and powder it a little. :) But that's not about us. In general, I think these three "no's" became the main barriers to participation for most potential entrants.

It's like car repair: cheap, good, fast — a shop can deliver only two of the three at once; sit down and choose which pair suits you. :) Same with the contest: I have time and can make a video, but there is no topic; or I can make a video and have a topic, but no time at all; or there is free time and a small topic, but the video scares me off. It's already good if two of the conditions are met at once.

Okay, enough lyrics; on with my story. I hadn't planned to participate — I had even already chosen which article I would vote for. Say what you like, Doz knows the software very well and uses it very sensibly. But today I learned that the contest has an intrigue: it turns out I can't vote at all; only newcomers who bought the software in 2011 can, since the contest is aimed at them. I was a little surprised, but the owner's word is law: the contest is an advertising campaign, and Alexander knows best how to run it. So I decided to post an article of my own after all. It is somewhat easier to write when it is clear who you are writing for; you can't please the whole collective farm anyway.
The long introduction is over; now to the point.
What does a beginner need once he has acquired a super-harvester like the XRumer + Hrefer bundle? That's right: to learn how to work with it, and to drop the illusion that you can earn money by blasting spam at public link lists. If you believe you can, donate your money to charity right away. You need to learn to use the tools of the bundle, preferably tuning them for yourself. The era of "grab more — throw more" is over; quantity is giving way to quality. So we will collect a base for ourselves; if you don't learn how, you will miss the train. Hrefer, of course, will help us with this. If you plan to promote your resources in Google, then we should also look for donor sites through Google — I think that is understandable and logical. But Google, like the Mistress of the Copper Mountain, does not hand out its riches to just anyone; it needs the right approach. Let me say right away: don't hope that the footprints you find in public will let you collect anything worthwhile. They are public precisely because they are worthless. I won't develop that thought further; it's better to show you how to build queries correctly so that you see the result. The rest you will refine yourself; the main thing is to understand the principle. We must collect by the footprints of the specific engines we need, not by generic forum footprints. That is the main mistake of beginners: not concentrating on one particular thing, but trying to cover everything at once. And one more thing: if you want to parse a more or less decent base, refuse to use search operators in your queries. No "inurl:", "site:", "intitle:" and so on — Google bans harvesters like that instantly. So let's carefully study the engines that XRumer currently works with:
Powered by php-Fusion
In version 7.07, XRumer was taught several new engines:
forumi.biz, forumb.biz, 1forum.biz, 7forum.biz, etc.
phpBB-fr.com, Solaris phpBB theme
And the process of teaching it new engines goes on continuously.
In general, we need to prepare correct queries for Hrefer to parse with. Let's take a forum engine as an example: SMF. We will disassemble it into parts for parsing, and our beloved Google will help us. Enter the Google query SMF forums — there is a lot of garbage in the results, so scroll to, say, the 13th page and pick any link. I came across this one: http://www.volcanohost.com/forum/index.php?topic=11.0 . Open it and explore. We need to find something characteristic on the page that can be applied to the search for other pages on this engine. In the footer we notice the inscription Powered by SMF 1.1.14; we put it in quotes and enter it into Google, which tells us it knows about 59 million results for this query. We skim through the links, then add a couple more words to this footprint, for example "Powered by SMF 1.1.14" poplar or "Powered by SMF 1.1.14" viagra. We make sure the query is excellent: the results contain only forums and almost no garbage.
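By the way, pulling the footer footprint out of a page can itself be automated, say with a regular expression. A tiny sketch of that idea in Python, using the example URL above (which, of course, may no longer be alive):

# Extract the "Powered by SMF x.y.z" footprint from a forum page.
import re
import urllib.request

url = "http://www.volcanohost.com/forum/index.php?topic=11.0"
html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")

match = re.search(r"Powered by SMF [\d.]+(?:\s*RC\d+)?", html)
if match:
    print(match.group(0))  # e.g. Powered by SMF 1.1.14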
Besides, as I said above, we are interested in quality, not quantity. Moving on. From the same forum we take another phrase from the footer, "2006-2008, Simple Machines LLC", also put it in quotes and feed it to Google. In response it reports that it knows more than 13 million results. Again we skim through the output, add extra keywords and check the results with them. We make sure this query is also excellent and likewise almost garbage-free. So we already have two iron queries. I suggest we leave the first forum alone for now and continue collecting queries from other forums; luckily, Google is already open on the query 2006-2008 Simple Machines LLC. From the results we take, for example, these forums: http://www.snowlinks.ru/forum/index.php?topic=1062.0 and http://litputnik.ru/forum/index.php?action=printpage;topic=380.0 and from their footers we take the following queries: "Powered by SMF 1.1.7" and "Powered by SMF 1.1.10" (I always advise wrapping Hrefer queries in quotes, because we need quality first of all). I think it's clear what we are doing: in the end we will have a database of queries for finding forums on the SMF engine (chosen as an example; with other engines it is similar).
It will look something like this (a small query-generator sketch follows the list):
Powered by SMF 1.1.2
Powered by SMF 1.1.3
Powered by SMF 1.1 RC2
Powered by SMF 1.1.4
Powered by SMF 1.1.8
Powered by SMF 1.1.7
"2006-2008, Simple Machines LLC"
And that's not all. While collecting engine versions, on some SMF forums we find the inscription "2001-2006, Lewis Media" in the footer. We check this query and it also fully satisfies us. We find a similar one: "2001-2005, Lewis Media". Running through more footers, we find another query: "SMFone design by A.M.A, ported to SMF 1.1". Check it — excellent. And so on. Half an hour of work and you have a wonderful database of queries for the engine, and for these queries Google bans far less often than if you used operators in them. At the same time, your base will be much cleaner than if you used queries like "index.php?topic=", because for that query Google returns not only the forums we need but also a lot of unrelated resources where someone managed to leave a link to a forum topic. You may argue: what's wrong with that? Others left a link, so we can too. But! Links can be left not only by XRumer but also by other programs; moreover, those can be specially tailored to leaving comments on one particular resource — so-called highly specialized software — and such links can also be left by hand. Again, I repeat: what matters to us is not the quantity of junk but the quality; with the right queries we will collect our base anyway. The advantage of this method is that you will hardly need to configure the Sieve-filter at all — it can simply be turned off, because Google will give you practically no garbage.
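If you still want to double-check what was harvested, the same footprint idea works as a filter: fetch each page and keep only those that actually carry the engine's signature. A rough standalone sketch (this is not Hrefer's Sieve-filter, just the principle; the file names are my placeholders):

# Keep only URLs whose pages contain an SMF footprint.
import urllib.request

FOOTPRINTS = ("Powered by SMF", "Simple Machines LLC")

def looks_like_smf(url: str) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            html = resp.read(65536).decode("utf-8", errors="ignore")
    except Exception:
        return False
    return any(fp in html for fp in FOOTPRINTS)

with open("harvested.txt", encoding="utf-8") as f, \
     open("clean.txt", "w", encoding="utf-8") as out:
    for line in f:
        url = line.strip()
        if url and looks_like_smf(url):
            out.write(url + "\n")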
I think it is very important to learn to use Hrefer correctly at the initial stage, because once you have learned that, you can always find a use for XRumer no matter how the situation changes. Protections keep getting more complicated, and if the protection on some type of engine has been strengthened and XRumer cannot cope with it at the moment, there is no point wasting resources on collecting those links and then running XRumer over them; it is better to focus on what gives results. Conversely, if the Botmaster Labs team has taught XRumer something new, you can quickly dissect the new patient and prepare a base for XRumer while the patient is still warm. Time is money: by the time you buy a base collected by someone else, the resource may no longer be relevant. Besides, collecting bases correctly for yourself greatly expands the "white-hat" uses of XRumer. And that is exactly where everything is heading: whether we like it or not, the whitening (or at least graying) process is under way, and blanket black-hat lists for everything imaginable are becoming a thing of the past.
All the other, purely technical aspects of working with Hrefer can be looked up in the help, and there is no point dwelling on them: every setting and timeout is tuned empirically for each machine individually.
As a bonus, I'll post here a template for parsing the Chinese search engine Baidu. The other day I was asked about it, so I threw it together in between other things. :)
Hostname=http://www.baidu.com
Query=s?wd=
LinksMask=
TotalPages=100
NextPage=
NextPage2=
CaptchaURL=
CaptchaImage=
CaptchaField=
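Judging by the fields, the harvester will request pages of the form Hostname + "/" + Query + keyword. Composing such a URL by hand looks like this — a sketch of the URL format only, under my own assumption about how the fields combine, not a description of Hrefer's internals:

# Compose a Baidu search URL from the template fields above.
from urllib.parse import quote

hostname = "http://www.baidu.com"   # Hostname
query = "s?wd="                     # Query
keyword = "保险公司"                  # insurance

url = hostname + "/" + query + quote(keyword)
print(url)  # http://www.baidu.com/s?wd=%E4%BF%9D%E9%99%A9%E5%85%AC%E5%8F%B8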
I test-parsed it: there was no ban, Hrefer collected resources quickly, and the parsing queries behaved much like Google's, except that the output was a sea of Chinese resources, many with high PR, and in plenty of places no European had ever set foot. It is better to parse with Chinese queries; Google Translate will help with this: type a list of keywords in Russian and translate it into Chinese. True, you can't just put Chinese words into Hrefer's Words file as-is; they must be URL-encoded first.
Instead of the Chinese words:
伟哥 - viagra
吉他 - guitar
其他 - rest
保险公司 - insurance
Put these codes in the Words file to replace them:
%E4%BC%9F%E5%93%A5
%E5%90%89%E4%BB%96
%E5%85%B6%E4%BB%96
%E4%BF%9D%E9%99%A9%E5%85%AC%E5%8F%B8
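Those codes are nothing mystical: they are the UTF-8 bytes of each word in URL (percent) encoding, so they can be produced automatically rather than looked up by hand. A minimal sketch, using the example words above:

# Percent-encode Chinese keywords for the Words file.
from urllib.parse import quote

for word in ["伟哥", "吉他", "其他", "保险公司"]:
    # quote() returns the URL-encoded UTF-8 bytes,
    # e.g. 伟哥 -> %E4%BC%9F%E5%93%A5
    print(quote(word))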
If you are promoting an insurance site, then posting a link in your profile on a thematic (!) forum — even a Chinese one found with the query "forum SMF" 保险公司 — will work very nicely.
In conclusion, I want to say that I have never understood people who complain that Hrefer parses badly or doesn't parse at all; I always want to answer: you simply don't know how to cook it. There is no parser better than Hrefer — you just have to use it correctly. Hrefer is a car: good, built solidly, German-style, but a person drives it, and everything depends on how sensibly it is driven; you can't make a car turn right and left at the same time.
Cleaning bases is a separate topic; I already wrote about it once, about three years ago, for the previous contest. Most of it is still relevant, except that now you can skip the check for "200 OK" — I never really liked that process anyway: the error rate was high and a lot of good resources got filtered out. Now this can be done almost automatically while XRumer works, although the process is not a complete analogue of the "200 OK" check. To the point: not long ago a wonderful feature appeared in XRumer — grabbing information from resources at the moment a project runs. It works like this: you enter a template that is processed during the run, and the information collected by the template is written to the xgrabbed.txt file in the Logs folder. You can use this function for anything; the room for imagination is huge. I use it once a week to remove "expired" links from my working base. It's no secret that forums die off every day, and the "Autograbbing" tool will help us clean the base of such resources.
After all, you must admit, often when you open, say, http://www.laptopace.com/index.php, you see that the domain is already parked at GoDaddy and there is no forum there any more. So, to throw this slag out of the base, we will do some grabbing. :) We open the source code of such a page and find a characteristic GoDaddy parking-page entry there — that entry becomes our grabbing template.
Now all the "dead" domains parked at GoDaddy will be known to us by name.
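If you want to run the same check outside of XRumer, the idea is easy to reproduce: fetch each URL from the base and look for parking-page markers in the HTML. A rough standalone sketch — the marker list is my own guess, not XRumer's Autograbbing template (see XRumer's help for that syntax):

# Flag base entries whose pages look like parked/expired domains.
import urllib.request

# Hypothetical marker strings; extend with what you actually see
# in the source of parked pages.
PARKING_MARKERS = ("godaddy", "domain is for sale", "parked free")

def is_parked(url: str) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            html = resp.read(65536).decode("utf-8", errors="ignore").lower()
    except Exception:
        return True  # unreachable counts as dead for our purposes
    return any(marker in html for marker in PARKING_MARKERS)

with open("base.txt", encoding="utf-8") as f:
    for line in f:
        url = line.strip()
        if url and is_parked(url):
            print("dead:", url)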
And here is a small selection for the "Autograbbing" tool, in case you want to clean your base of various "expired" domains: