Participate

Scrapinghub participates in Google Summer of Code as a sub-organization of the Python Software Foundation.

To participate as a student, follow the instructions in the Python Summer of Code documentation. See also the Google Summer of Code student guide.

The guidelines below are specific to the Scrapinghub sub-organization.

Choosing an Idea

We have built a list of ideas that we consider good choices for a student application.

These are only our suggestions; feel free to propose your own:

  • To discuss your own idea for one of our projects, open an issue on the GitHub page of the corresponding project.

  • To discuss an idea for a new project related to web crawling or data processing technologies, contact us at opensource@scrapinghub.com.

Communication

  • Use GitHub for communication when possible.

    For ideas that have an associated GitHub issue, discuss them in that issue. If there is no issue for your idea yet, open an issue for it and discuss your thoughts there.

    If you want to show us some code, create a pull request. You can mark your pull request as a ‘draft’ if it is not meant to be merged.

  • If you have any doubts, do not hesitate to contact us at opensource@scrapinghub.com.

Pre-Application Pull Request

  • Get familiar with Git and GitHub. You need to know how to fork a repository, create a branch, create a pull request, and address merge conflicts.

  • If the target project has contribution guidelines, follow them. Some projects have a ‘Contributing’ page or section in their documentation or README file. Other projects have a CONTRIBUTING file.

  • If you can identify a self-contained, short task (less than a week of work) from your idea, start working on it. It can help your application.

  • Check open issues on the GitHub page of the corresponding project. Select Contribute in the project list to find recommended issues for beginners.

  • Another good source of contributions is finishing old, abandoned pull requests. Ask first, to confirm the status of the work and why it stalled.

  • Be sure to read issue threads thoroughly. They often include previous or ongoing attempts at a fix.

  • When creating a pull request that fixes a registered issue, indicate so with an issue-closing keyword.

  • Make sure your pull requests have proper documentation and tests. All current tests should pass to get your changes merged.

  • Do not get discouraged by our feedback! We will probably ask you for more changes than we ask of regular contributors, to test how responsive you are and how well you implement our requests.
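
As an illustration of the issue-closing keyword mentioned above, here is a minimal Python sketch that checks which issues a pull request description would close. It assumes the closing keywords documented by GitHub (close, fix, resolve and their variants) followed by `#` and an issue number. The function name `issues_closed_by` and the issue number 123 are hypothetical, chosen only for this example; this is not part of any Scrapinghub tooling.

```python
import re
from typing import List

# Closing keywords as documented by GitHub (assumed list for this sketch).
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

# Matches e.g. "Fixes #123": a keyword, whitespace, then "#<number>".
_PATTERN = re.compile(
    r"\b(?:%s)\s+#(\d+)" % "|".join(CLOSING_KEYWORDS),
    re.IGNORECASE,
)


def issues_closed_by(description: str) -> List[int]:
    """Return the issue numbers that a pull request description would close."""
    return [int(number) for number in _PATTERN.findall(description)]


if __name__ == "__main__":
    # Hypothetical pull request description; issue 123 is made up.
    example = "Handle empty responses in the parser.\n\nFixes #123"
    print(issues_closed_by(example))
```

A description that only mentions an issue, such as "See #5", yields an empty list, which is a quick way to notice that the pull request will not close the issue it is meant to fix.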

Application

  • Your proposal is a great opportunity to show us your research skills and dedication, so investigate as much as you can on your own first; the questions you then ask us will be narrower and easier to answer.

  • Hard ideas are favoured over easy ones, but quality proposals and development skills are going to be ranked even higher.

  • We recommend that you start an early draft of your proposal somewhere publicly accessible, so mentors and users can review it and provide feedback. It is also easier for us to answer questions or help you when you get stuck if we already know your work.