    mahfuj Cr7
    Aug 03

    I analyzed 70,000 apps on Wandoujia, and the results were unexpected

    in Savage This!

    This article uses Scrapy to crawl the 70,000+ apps listed across the whole Wandoujia site and then runs an exploratory analysis on them. If you are not interested in the data capture section, you can skip straight to the data analysis section.

    1. Analysis background

    Previously, we used Scrapy to crawl and analyze 6,000+ apps on Kuan.com. Why write another article about crawling apps? Because I like tinkering with apps, haha. Mainly, though, for the following reasons.

    First, the previously crawled webpage was very simple. When crawling Kuan.com, a single for loop over a few hundred pages was enough to collect all the content. Real sites are often not that easy. To practice capturing the data of an entire website and improve my crawler skills, this article targets the Wandoujia ("pea pod") website.


    The goal is to crawl the app information under every category of the website and download the app icons. That is about 70,000 apps, an order of magnitude more than Kuan.

    Second, to practice using the powerful Scrapy framework again. I had only used Scrapy for basic crawling and had not fully understood how powerful it is, so this article tries to use Scrapy in more depth, adding settings such as a random UserAgent, proxy IPs and image download.

    Third, to compare the two websites, Kuan and Wandoujia. I believe many people use Wandoujia to download apps, while I use Kuan more, so I also want to compare the app features of the two sites.

    Without further ado, let's start the crawling process.

    1. Analytical goals

    First of all, let's take a look at what the Wandoujia pages to be crawled look like. The apps on the website are divided into many categories, including "App Play", "System Tools", etc. There are 14 major categories in total, and each one is subdivided into multiple subcategories; for example, video playback includes "video", "live broadcast", etc. Click "Video" to enter the second-level subcategory page, which shows some information for each app, including its icon, name, number of installations, size, comments, etc. From there, we can go to the third-level page, that is, the details page of each app, and we can see parameters such as the number of downloads, the
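The Scrapy additions mentioned earlier (random UserAgent, proxy IP and image download) could look roughly like the sketch below. It is not the author's code: the User-Agent strings, the proxy address, the class name `RandomUserAgentProxyMiddleware`, the module path `myproject.middlewares` and the `icons` folder are all placeholders assumed for illustration. Only the setting names and the `ImagesPipeline` path are Scrapy's own.

```python
import random

# Hypothetical pools, assumed for illustration; replace with real values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]
PROXIES = ["http://127.0.0.1:8888"]  # placeholder proxy address


class RandomUserAgentProxyMiddleware:
    """Downloader middleware: random User-Agent and proxy for each request."""

    def process_request(self, request, spider):
        # Overwrite the User-Agent header with a randomly chosen one.
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        # Scrapy's built-in HttpProxyMiddleware reads request.meta["proxy"].
        if PROXIES:
            request.meta["proxy"] = random.choice(PROXIES)
        return None  # returning None lets Scrapy keep processing the request


# settings.py fragment ("myproject.middlewares" is a hypothetical module path).
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.RandomUserAgentProxyMiddleware": 543,
}
ITEM_PIPELINES = {"scrapy.pipelines.images.ImagesPipeline": 1}
IMAGES_STORE = "icons"  # assumed folder for the downloaded app icons
```

With this setup, each scraped item would carry the icon link in an `image_urls` list so that `ImagesPipeline` can download it; the pipeline also needs Pillow installed.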



