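Citation: Xue Yonghong, Dong Chunyu. Two schools of thought and two types of big data in big data research: A case-based study[J]. Science & Technology Review (科技导报), 2021, 39(13): 125-133; doi: 10.3981/j.issn.1000-7857.2021.13.014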

Two Schools of Thought and Two Types of Big Data in Big Data Research: A Case-Based Study

Xue Yonghong¹, Dong Chunyu²

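1. School of Science, North China Institute of Science and Technology, Sanhe 065201
2. School of Philosophy, Beijing Normal University, Beijing 100875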

Received: 2020-06-09; Revised: 2020-09-16

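Funding: Youth Fund Project for Humanities and Social Sciences Research of the Ministry of Education (20YJC720025); Key Project of the National Social Science Fund of China (18AZX008)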

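About the author: Xue Yonghong, associate professor; research interests: philosophy of science and the communication of scientific culture; email: aristotle@ncist.edu.cn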

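Abstract: In research on big data, the academic community has split into two sharply distinct and directly opposed schools: the radicals and the conservatives. Through a study of two classic big data cases, this paper finds that "big data" actually refers to two kinds of objects that are distinct yet related: one is "studying science with the methods of data," and the other is "studying data with the methods of science." These two types of big data, and the marked differences between them, are the reason the radical and conservative camps formed. After summarizing the characteristics of each type, the paper proposes a path for fundamentally resolving the current opposed and confused state of understanding and for moving big data research into deeper waters.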

Keywords: big data; radicals; conservatives; Google flu prediction; human number sense
