[code] Scraping API Docs

    Jim
    last edited by 11 Feb 2012, 13:20

    Example code for scraping the SketchUp API docs pages. Might be useful for indexing or analysis...

    Use at your own risk - if you run it "too much," the Google machine will temporarily block you. Who knows what happens if you "abuse" it. I didn't look it up, but one should assume scraping is in violation of Google's Terms of Service.
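
One way to lower the odds of a block is to pause between page fetches. The following is only a sketch: `each_with_pause` is a hypothetical helper that is not part of scrape.rb, and the pause length is an arbitrary guess, not a documented limit.

```ruby
# Hypothetical helper (not part of scrape.rb): yields each item in order,
# pausing `pause` seconds between consecutive items and never after the last.
# The `sleeper` argument exists only so the pause can be swapped out in tests.
def each_with_pause(items, pause, sleeper: ->(seconds) { sleep(seconds) })
  items.each_with_index do |item, i|
    sleeper.call(pause) if i > 0
    yield item
  end
end

# Placeholder items and a short pause, just to show the call shape.
each_with_pause(%w[first second third], 0.1) { |item| puts item }
```

In the script, the second loop could be wrapped the same way, for example `each_with_pause(classes.keys, 2) { |name| ... }`.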

    Just run it once and direct output to a file:

    $ ruby scrape.rb > output.txt

    
    # Scrapes API docs for class names, method names, and method versions.
    require 'open-uri'
    require 'nokogiri'
    
    # Ctrl-C on WinXP: set a flag that the loops below check before continuing.
    trap("INT") {
        $stderr.puts "abort."
        @abort = true
    }
    
    base = "https://developers.google.com/"
    class_index_url = base + "sketchup/docs/classes"
    
    # URI.open is needed on Ruby 3.0+, where a bare open() no longer fetches URLs.
    page = Nokogiri::HTML(URI.open(class_index_url))
    
    classes = {}
    
    page.css(".columns a").each do |link|
        classes[link.text] = link['href']
        break if @abort
    end
    
    exit if @abort
    
    classes.each do |name, url|
        puts name
        # Build the page URL from the class name (the href stored in `url` is not used).
        loc = base + "sketchup/docs/ourdoc/" + name.downcase
        page = Nokogiri::HTML(URI.open(loc))
        page.css(".apireference").each do |elem|
            method_name    = elem.css(".itemname").text
            method_version = elem.css(".version").text
            puts "#{method_name},#{method_version}"
        end
        puts
        break if @abort
    end
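
For the indexing or analysis step, the saved output can be read back into a hash. This is a rough sketch that assumes the layout the script prints (a class name line, one "method,version" line per method, then a blank line). `parse_scrape_output` is a hypothetical name, the sample text is made up, and the sketch does not handle method names or version strings that themselves contain line breaks.

```ruby
# Hypothetical parser (not part of scrape.rb) for the text the script prints.
# Returns { "ClassName" => [{ name: "method", version: "version text" }, ...] }.
def parse_scrape_output(text)
  index = {}
  # Blocks are separated by one or more blank lines (LF or CRLF).
  text.split(/(?:\r?\n){2,}/).each do |block|
    lines = block.lines.map(&:strip).reject(&:empty?)
    next if lines.empty?
    class_name = lines.shift
    index[class_name] = lines.map do |line|
      # Split on the first comma only, in case the version text has commas.
      method_name, version = line.split(",", 2)
      { name: method_name, version: version.to_s }
    end
  end
  index
end

# Made-up sample in the same layout, not real scraped data.
sample = "ExampleClass\nexample_method,Version 1\nother_method,Version 2\n\nEmptyClass\n\n"
parse_scrape_output(sample).each do |class_name, methods|
  puts "#{class_name}: #{methods.length} method(s)"
end
```

With a real run, `parse_scrape_output(File.read("output.txt"))` would give the same structure for the whole file.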
    
    
