应用概率统计 (Chinese Journal of Applied Probability and Statistics), Apr. 2024, Vol. 40, No. 2, pp. 229-263
doi: 10.3969/j.issn.1001-4268.2024.02.002

Selective Review of Biased Sampling Problems with Applications in Modern Statistics

QIN Jing
(National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD 20892, USA)
E-mail: jingqin@niaid.nih.gov. Received November 9, 2023. Revised January 29, 2024.

Abstract: Biased sampling is a pervasive issue that transcends various disciplines, impacting fields such as econometrics, epidemiology, medicine, survey research, and more recently, machine learning and artificial intelligence (AI). This ubiquitous challenge arises when the selection of data points for analysis or research introduces systematic biases, potentially compromising the accuracy and reliability of research outcomes. In this paper, our objective is to provide a comprehensive overview of the foundational concepts related to biased sampling problems and the methods of inference. Furthermore, we aim to establish a connection between biased sampling issues and the more recent discussions in machine learning regarding distribution shift problems. Additionally, we will delve into the latest advancements in biased sampling, particularly within the context of transfer learning and conformal inference for predictive confidence intervals. Our ultimate goal is to present this material in a manner that is accessible to graduate students, enabling them to identify applications of biased sampling problems within their own research endeavors. It is with deep respect and gratitude that we dedicate this paper to the memory of the late Professor Shisong Mao, whose guidance and wisdom have been invaluable throughout the years.

Keywords: biased sampling problems; causal inference; conformal predictive interval; distributional shift; transfer learning; in memory of Professor Shisong Mao

2020 Mathematics Subject Classification: 62D20; 62G20; 62G05

Citation: QIN J. Selective review of biased sampling problems with applications in modern statistics [J]. Chinese J Appl Probab Statist, 2024, 40(2): 229-263.

1 Interactions with Professor Shisong Mao

In the world of academia, the passing of a distinguished professor is a poignant moment, marked not only by the loss of an esteemed scholar but also by the legacy he or she leaves behind. It is a time when we come together to remember, honor, and celebrate the profound contributions of a remarkable individual whose work has indelibly shaped the field of statistics. This paper is a tribute to the enduring influence of Professor Shisong Mao, whose wisdom, dedication, and innovative thinking continue to inspire us. Professor Mao was not merely a
statistician but a luminary whose work transcended the boundaries of traditional statistical analysis. With a passion for both the art and science of statistics, he broke new ground in research, teaching, and mentorship. His career was characterized by a relentless pursuit of knowledge, a commitment to statistical rigor, and an unwavering belief in the power of data to illuminate the mysteries of the world.

In the traditional academic setting, it is customary for each professor to mentor only a limited number of graduate students due to various constraints. Recognizing the critical shortage of statistics educators, the Chinese Department of Education took a proactive step by introducing a two-year program in 1984 to enroll college graduates for the study of statistics. This initiative aimed to address the growing demand for skilled statisticians in various sectors. Under the visionary leadership of Professor Mao, the Department of Statistics at East China Normal University embraced this educational challenge. In an exemplary move, the department opened its doors to a cohort of 24 special graduate students specializing in statistics. This significant addition complemented Professor Mao's ongoing mentorship of
his two regular three-year graduate students. This bold and forward-thinking decision not only expanded the horizons for aspiring statisticians but also underscored Professor Mao's commitment to nurturing future talent in the field of statistics. It reflected his dedication to bridging the gap in statistics education and inspiring the next generation of statisticians in China.

My initial encounter with Professor Mao occurred during the spring of 1984. I had just received the news that I had passed the preliminary graduate student entry examination, which marked the beginning of a transformative journey. My destination was East China Normal University, where I was to face the second round of oral examinations. This momentous occasion held great significance for me, not only in terms of my academic aspirations but also because it marked my first venture from a remote and relatively small town in Sichuan Province. The town, known as Wanzhou, would later become part of Chongqing Special District due to the construction of the Three Gorges Dam. It was a place where life moved at a slower pace, far removed from the bustling metropolis that awaited me in Shanghai, China's largest city. The transition from a tranquil and close-knit community to the dynamic and sprawling urban landscape of Shanghai was a monumental shift. As an unsophisticated young middle school math teacher stationed in an isolated township, I embarked on my first-ever solo journey to a dynamic urban center. The mix of excitement and nervousness was palpable. The prospect of studying in Shanghai and pursuing my academic dreams was exhilarating, but it was also accompanied by a sense of the unknown. The city's vastness and the anonymity of its
busy streets were both thrilling and intimidating. It was a stark contrast to the closely bonded community I was accustomed to. The cacophony of traffic, the towering skyscrapers, and the neon lights painted a picture of a world entirely different from what I had known. My situation resonates with the description in the famous Chinese novel “Dream of the Red Chamber”, where Grandma Liu's entry into the grand house brings her face-to-face with an entirely unfamiliar and overwhelming environment. In my case, the parallel may be even more pronounced, given the added challenge of grappling with a different dialect, the Shanghai accent. Much like the character in the story, my experience of joining the platform created by Professor Mao, with its innovative approach and a diverse group of graduate students, felt like stepping into a world filled with novel experiences and opportunities. Unquestionably, this adventure was a vital step in my personal and professional growth. It symbolized not only a geographical transition but a leap into the uncharted waters of higher education and self-discovery. The small-town math teacher was on the brink of a new chapter, eager to embrace the challenges and opportunities that the big city had to offer.

Professor Mao's reputation extended far and wide, earning him the respect and recognition of colleagues from universities across the academic landscape, especially within the field of statistics. When we had the privilege of meeting Professor Mao, I observed that many of my fellow students, representing various universities apart from East China Normal University, could extend warm regards from their own professors. However, when it came to my turn to shake hands with Professor Mao, I found myself in a unique position. None of my professors had a prior connection
with him, primarily because he was unfamiliar with my undergraduate college. In light of this, I chose to express my personal admiration and warm regards, addressing Professor Mao with sincerity: “Professor Mao, please accept my heartfelt greetings.” In response, Professor Mao warmly reciprocated with a smile, creating a brief yet meaningful connection that overcame any prior lack of familiarity.

During my tenure at East China Normal University from 1984 to 1988, I frequently heard Professor Mao underscore the significance of nurturing a profound passion for the art of data collection. He advocated treating data with the same care and devotion one might reserve for a loved one. Moreover, he encouraged individuals to delve deeply into the data, allowing the information to organically weave its own narrative.

In the autumn of 1987, just like every other department, the statistics department was bustling with preparations for the school's upcoming anniversary celebration. As tradition dictated, a compelling lecture based on one's own research was the chosen way to contribute to this special occasion. However, during this particular period, I found myself immersed in a different endeavor: completing the rigorous application process for graduate school in North America. My days were filled with the arduous task of completing numerous application forms and preparing for the TOEFL (Test of English as a Foreign Language) examination. One day, while I was engrossed in these preparations, Professor Mao, a respected figure in the department, approached me with a question about my research. I couldn't help but feel a wave of embarrassment wash over me, for I had no research work to speak of at that point. It was an awkward moment as I explained my current circumstances to Professor Mao. In response, Professor Mao, a seasoned and understanding mentor, offered a reassuring gesture. “I can excuse you this time,” he said with a kind smile, “but I hope this is the last time.” This encounter with Professor Mao served as a pivotal moment in my academic journey. It was a gentle nudge, a reminder of the importance of research and the academic commitment that lay ahead. From that day forward, I embarked on a quest to delve into the world of statistical research, determined to ensure that it would indeed be the last time I found myself unprepared in the presence of my academic peers and mentors. Little did I know that this experience would serve as the catalyst for a rich and rewarding academic journey, one that would ultimately lead me to make significant contributions to the field of statistics. In hindsight, I am grateful for Professor Mao's guidance and understanding, as it ignited a passion for research that continues to shape my career in the world of statistics.

In late 1991 and early 1992, as I was deeply immersed in my Ph.D. journey at the University of Waterloo, a special and cherished connection blossomed in my academic life: a close bond with Professor Mao and Professor Jixiang Zhou, another distinguished professor from East China Normal University. Their visit to our university during that period marked a pivotal and memorable chapter in my academic and personal development. Professors Mao, Zhou, and I engaged in numerous discussions that covered a wide spectrum of subjects, ranging from
the intricacies of statistical methodologies to the broader tapestry of life itself. These conversations were not confined to lecture halls or meeting rooms; we extended our discussions beyond the academic sphere. In fact, many evenings found us teaming up in the kitchen to prepare dinner together. These shared culinary experiences transcended mere meal preparation; they served as an extension of our intellectual and personal connection. As we chopped, stirred, and simmered, we continued our exchanges on topics that spanned from academic challenges to the joys and complexities of life.

This period of close interaction with Professor Mao left an indelible mark on my academic and personal journey. It was more than just a professor-student relationship; it was a mentorship filled with profound insights, camaraderie, and shared experiences. The wisdom I gained from these discussions, both academic and personal, has continued to shape my path in profound ways. As I reflect on those shared dinners and conversations, I am reminded of the lasting impact of those moments and the invaluable guidance that Professor Mao provided during my formative years in academia. His visit to the University of Waterloo was not just an academic event; it was a transformative experience that continues to inspire and influence my academic and personal pursuits to this day.

Professor Mao emphasized the paramount importance of mastering the art of data collection in the realm of scientific research. To advance our understanding and insights, we must dedicate our utmost efforts to this fundamental aspect. However, the reality of practical applications often brings forth an inescapable challenge: the specter of selection bias. In this paper, we embark on a comprehensive exploration of the concept of biased sampling. We delve into the nuances of this topic, dissecting its implications and tracing its impact on the fields of modern statistics and machine learning. Our objective is to unravel the multifaceted nature of biased sampling, shed light on its real-world consequences, and investigate its relevance in the contemporary landscape of data-driven disciplines. Through this discourse, we aim to foster a deeper appreciation for the intricacies of data collection, the challenges it presents, and the innovative solutions that arise in the face of selection bias.

2 Introduction on Biased Sampling and Distribution Shift in Machine Learning

Biased sampling is a phenomenon that arises when an investigator collects samples from a population in such a way that the distribution of the resulting sample differs from that of the target population. This
disparity in distribution occurs because, under the chosen sampling method, not all units within the population have an equal opportunity to be included in the sample. In other words, the natural sampling plan, while convenient and often intuitive, inadvertently introduces disparities in the representation of different segments of the population, thus leading to a biased sample. This occurrence can significantly impact the validity and generalizability of the research findings, underscoring the importance of understanding and mitigating bias in the sampling process.

Biased sampling issues are indeed widespread, transcending various domains such as survey sampling, epidemiology studies, econometrics, and recently the machine learning literature. These challenges are not limited to one specific field but have the potential to affect the quality and validity of research in a wide array of disciplines. As pointed out by Professor James Heckman [1], the 2000 Nobel Laureate in Economics, “Sample selection bias may arise in practice for two reasons. First, there may be self selection by the individuals or data units being investigated. Second, sample selection decisions by analysts or data processors operate in much the same fashion as self selection”.

In the ever-evolving landscape of machine learning, data is the lifeblood that fuels the algorithms driving everything from recommendation systems to image recognition and natural language processing. Data, ho
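
As a small illustration of the definition of biased sampling given in this section (units of the population do not all have the same chance of being selected), the following Python sketch simulates size-biased selection from a finite population and compares the naive sample mean with an inverse-probability-weighted mean. The synthetic population, the size-proportional selection rule, and the helper names `biased_sample` and `weighted_mean` are assumptions made for illustration only; they are not taken from the paper.

```python
import random
import statistics


def biased_sample(population, weights, n, rng):
    """Draw n units with replacement; unit i is chosen with probability
    proportional to weights[i] (unequal inclusion chances)."""
    return rng.choices(population, weights=weights, k=n)


def weighted_mean(sample, selection_weights):
    """Inverse-probability-weighted mean: each sampled unit is weighted
    by 1 / (its relative selection weight), then weights are normalized."""
    inverse = [1.0 / w for w in selection_weights]
    return sum(x * v for x, v in zip(sample, inverse)) / sum(inverse)


rng = random.Random(0)

# Hypothetical target population: positive values bounded away from zero.
population = [rng.expovariate(1.0) + 0.1 for _ in range(20000)]
true_mean = statistics.fmean(population)

# Size-biased selection: larger values are more likely to be sampled.
sample = biased_sample(population, population, 5000, rng)

naive = statistics.fmean(sample)          # ignores the selection mechanism
corrected = weighted_mean(sample, sample) # weights each unit by 1 / value

print(f"population mean:        {true_mean:.3f}")
print(f"naive sample mean:      {naive:.3f}")
print(f"weighted (corrected):   {corrected:.3f}")
```

In this sketch the selection weight of each unit equals its own value, so the weighted estimator reduces to the harmonic mean of the sample. The correction works here only because the selection mechanism is known; when it is unknown, it has to be modelled or estimated.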
