The publishing of open data is considered a key element of civic participation, paving the way to ‘public value’, a term that underpins social contribution. One result is the popularity of data portals published around the world by governments and by public and private organizations. However, the proliferation of data portals raises concerns about the discoverability and validity of these data sources, especially the extent to which they contribute to open data and open science. The purpose of this work is to develop a framework that reveals open data publishing using Common Crawl, a freely available open science project. The idea is to identify open data-related initiatives and to gather information about their availability, with an iterative and differential process at the core of the framework. The main outcome is a proposed model for a historical data repository, which involves both the use and the creation of open science and opens new research possibilities based on the publishing of derived data.