This paper discusses the objectives, process, and outcomes of creating a digital dataset for a historical research project on the tribute system in Heilongjiang during the Qing dynasty (1644-1911). Relevant information from non-digital primary sources was compiled into the dataset to facilitate quantitative and qualitative analyses of the system’s attributes. In the course of curating the data, the investigators addressed the challenges of defining a common set of variables and matching the original Chinese data with English translations. They also tested methods for creating datasets that can accommodate heterogeneous sources and be shared among multiple users. pp. 84-95