Dryad dataset: Gene-language models are whole genome representation learners https://datadryad.org/dataset/doi:10.5061/dryad.vx0k6djzn