Notes on the files:
===================

citywords_metro_word_cnt_unrestricted_url_filtered_metro_ordered.rpt.gz
    comma-separated values in GZIP compression
    column headers: word,cbsacode,population,cnt

aggregate_metrics_data_w_fit_results.json
    each line is a JSON containing the data and results of the power-law fits for the aggregate metrics
    data is contained in keys "x" (population) and "y" (metrics)
    "model" stands for the model from Leitao et al. 2016

wordcount_fits_results.json.gz
    gzipped line by line JSON file
    contains data and results for the power-law fits of words
    data is contained in keys "x" (population) and "y" (metrics)
    Delta BIC is calculated as compared to the total word volume, e.g. a fixed gamma model from Leitao et al. with 1.0207

zipf.csv
    comma-separated values
    contains how many times a certain word count occurred in the total corpus
    column headers: total_word_count,count_of_total_word_count

citywise_zipf.csv
    comma-separated values
    contains the results of Zipf's law fits on citywise corpora
    column headers: cbsacode,beta,beta_error,total_wordcount
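
Reading the files (sketch):
The snippet below is a minimal sketch of how the files described above might be loaded with the Python standard library. It is not part of the original dataset tooling. The helper names (open_text, read_json_lines, read_csv_rows) are hypothetical. It assumes UTF-8 text, one JSON record per line in the JSON files, and plain comma-separated rows in the CSV files. These notes do not say whether the CSV files include the header row, so read_csv_rows accepts an optional fieldnames argument for files without one.

```python
import csv
import gzip
import json


def open_text(path):
    """Open a plain or gzip-compressed file in text mode, chosen by suffix."""
    # Assumption: files ending in ".gz" are gzip-compressed UTF-8 text.
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", newline="")
    return open(path, "rt", encoding="utf-8", newline="")


def read_json_lines(path):
    """Yield one parsed JSON record per non-empty line."""
    with open_text(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def read_csv_rows(path, fieldnames=None):
    """Yield each CSV row as a dict of strings.

    With fieldnames=None the first row is treated as the header.
    Pass the column names explicitly if the file has no header row.
    """
    with open_text(path) as handle:
        yield from csv.DictReader(handle, fieldnames=fieldnames)


# Example usage (commented out because it needs the data files on disk):
# for record in read_json_lines("wordcount_fits_results.json.gz"):
#     population, metric = record["x"], record["y"]
# for row in read_csv_rows("citywise_zipf.csv"):
#     print(row["cbsacode"], row["beta"])
```

Both readers are generators, so the large gzipped files can be processed one record at a time without loading them fully into memory. All CSV values come back as strings and need to be converted to numbers by the caller.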