Common Screens

encyclopedic, internet, natural language processing

Description

A corpus of web screenshots and associated metadata covering over 70 million websites.

Update Frequency

Monthly

License

Attribution 4.0 International (CC BY 4.0)

Documentation

https://commonscreens.com/?page_id=1492

Managed By

Common Screens

Contact

admin@commonscreens.com

How to Cite

Common Screens was accessed on DATE from https://registry.opendata.aws/comonscreens.

Usage Examples

Tutorials

Resources on AWS

  • Description
    Common Screens (jpeg and csv format)
    Resource type
    S3 Bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::common-screens
    AWS Region
    us-west-2
    AWS CLI Access (No AWS account required)
    aws s3 ls --no-sign-request s3://common-screens/
  • Description
    CloudFront CDN distribution for hotlinking screenshots
    Resource type
    CloudFront Distribution
    Hostname
    dqh5x5k6xg3n1.cloudfront.net
    AWS Region
    us-west-2
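
Since the bucket allows unsigned access, its contents can probably also be listed over plain HTTPS without the AWS CLI. The following is a minimal Python sketch using only the standard library, not an official client. It assumes the bucket answers anonymous ListObjectsV2 requests at the usual virtual-hosted S3 endpoint, and that CloudFront paths mirror S3 object keys; neither is confirmed by this entry. The helper names (list_url, parse_keys, fetch_keys, cdn_url) are hypothetical.

```python
from urllib.parse import urlencode, quote
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Values taken from the resource listing above.
BUCKET = "common-screens"
REGION = "us-west-2"
CDN_HOST = "dqh5x5k6xg3n1.cloudfront.net"

# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def list_url(prefix="", max_keys=10):
    """Build an unsigned ListObjectsV2 URL for the bucket."""
    query = urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{BUCKET}.s3.{REGION}.amazonaws.com/?{query}"


def parse_keys(xml_text):
    """Extract object keys from a ListBucketResult XML document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]


def fetch_keys(prefix="", max_keys=10):
    """Fetch one page of object keys (performs a network request)."""
    with urlopen(list_url(prefix, max_keys), timeout=30) as resp:
        return parse_keys(resp.read())


def cdn_url(key):
    """Build a hotlink URL for a key.

    ASSUMPTION: the CloudFront distribution serves objects under the
    same path as their S3 key.
    """
    return f"https://{CDN_HOST}/{quote(key)}"
```

Calling fetch_keys() should return the first few keys in the bucket if anonymous listing is permitted; the key layout itself is described in the documentation linked above, not here.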
