First National Symposium on Learner Corpora

Published: 2026-05-08

The First National Symposium on Learner Corpora was held on December 26 and 27, 2009, at the School of Chinese Language and Literature, Beijing Foreign Studies University. It was hosted by the National Research Centre for Foreign Language Education at Beijing Foreign Studies University and organized by Foreign Language Teaching and Research Press.

The conference brought together more than 100 experts and scholars from across China engaged in research related to learner corpora. It featured seven keynote presentations by Naixing Wei, Anping He, Wenzhong Li, Jiajin Xu, Jianzhong Pu, Huaqing Hong, and Maocheng Liang, as well as the presentation and discussion of more than 60 papers. The symposium included extensive discussion of learner corpus construction, research methods, and teaching applications. While reviewing and reflecting on previous learner corpus research, the participants showed a strong drive to innovate. For example, a number of scholars were no longer satisfied with traditional corpus research based solely on text. Instead, they used multimedia and web programming technologies to integrate audio, video, and text into multimodal learner corpora. Others extended learner corpus analysis from local linguistic units based on lexical items to multidimensional, integrated investigations of lexico-grammatical co-occurrence, semantic preference, and semantic prosody. The many new research ideas, perspectives, and technical approaches that emerged from the discussions point to a promising future for learner corpus research in China.

On the morning of December 26, Professor Qiufang Wen, Director of the National Research Centre for Foreign Language Education at Beijing Foreign Studies University, and Ms. Xiaoling Chang, President of the Higher English Education Publishing Branch of Foreign Language Teaching and Research Press, delivered opening remarks, marking the beginning of the symposium.

Afterwards, Professor Naixing Wei of Shanghai Jiao Tong University, President of the Corpus Linguistics Society of China, and Professor Anping He of South China Normal University delivered keynote speeches titled “Phraseology in Contrast: A Multi-dimensional Contrastive Analysis of Learner Phraseologies across Corpora” and “Overuse, Underuse, Misuse, So What?” respectively. Professor Wei first reviewed the development and major research orientations of corpus linguistics and learner corpus studies. Using examples such as *bent on*, he then proposed a multidimensional approach to analyzing learner phraseology along two axes: the linguistic level (collocation, colligation, semantic preference, and semantic prosody) and the comparative level (interlanguage corpora, native-speaker corpora, and bilingual parallel corpora). This approach adds layers and depth to the analysis and can help explain problems in learner language use. Professor He’s presentation addressed a prominent problem in learner corpus research: an emphasis on description at the expense of interpretation. Taking Chinese students’ overuse of *I think* as an example, Professor He pointed out that, beyond frequency counts, researchers can examine the linguistic environment surrounding *I think*, such as its semantic preference and semantic prosody, to analyze the pragmatic features of such items. Analysis of this kind showed that Chinese students tend to convey too little content while using too much modality.
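The idea of looking beyond raw frequency to the words around a node item can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in either keynote: the function name `collocates`, the window size, and the toy sentence are all assumptions made for the example.

```python
from collections import Counter


def collocates(tokens, node, window=3):
    """Count words within `window` tokens to the left or right of each
    occurrence of the node phrase (hypothetical helper for illustration)."""
    node = tuple(node)
    n = len(node)
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) == node:
            left = tokens[max(0, i - window):i]    # up to `window` words before the node
            right = tokens[i + n:i + n + window]   # up to `window` words after the node
            counts.update(left + right)
    return counts


# Toy learner sentence; a real study would read tokenized corpus files instead.
sample = "well i think it is good but i think it is hard".split()
profile = collocates(sample, ("i", "think"), window=2)
```

Sorting the counts (for example with `profile.most_common()`) gives a first view of the company a phrase keeps. Judging semantic preference or semantic prosody still requires reading the concordance lines themselves.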

On the afternoon of December 26, Professor Wenzhong Li of Henan Normal University and Dr. Jiajin Xu of the National Research Centre for Foreign Language Education delivered keynote speeches titled “Constructing a Learner English Portfolio Corpus on the Open Corpus Platform: A Research Proposal” and “Storied Self in Another Language: A Collocational Approach to Interlanguage Identity,” respectively. Noting that most existing learner corpora capture synchronic language output, Professor Li proposed a dynamic mechanism for collecting student language data through an online learning system. He called this model, with online language learning at the front end and formative collection of student data at the back end, an “e-portfolio corpus.” Such a dynamic, real-time learner corpus can serve as an important resource for the formative assessment of students. Professor Li also demonstrated a prototype platform, developed by his team, for collecting and searching the text, audio, and video in an “e-portfolio corpus.” Dr. Xu’s presentation took a different path. Departing from the usual focus of learner corpus research on diagnosing errors in English learning, he used learner corpora to conduct a sociolinguistic study of the discursive identity of Chinese university English majors. Using key keywords and concgrams, Dr. Xu constructed an abstract self-image of a Chinese English major, “Shi Weike” (施惟可, SWECCL), along three dimensions: subject characteristics, social networks, and interactional force.

On the morning of December 27, Professor Jianzhong Pu of the PLA University of Foreign Languages and Dr. Huaqing Hong of the National Institute of Education, Nanyang Technological University, Singapore, delivered keynote speeches titled “The Development of Learners’ Language Ability as Seen from the Use of Lexical Chunks” and “Multimodal Learner Corpus Construction: Challenges and Directions,” respectively. Drawing on longitudinal student data he had collected, Professor Pu conducted a detailed analysis of lexical chunks centered on the words *find*, *life*, and *hard* in terms of collocation, colligation, semantic preference, and semantic prosody. Methodologically, he also compared key clusters. Professor Pu’s analysis showed that after about two years of study, learners had made little noticeable progress on any of these dimensions of lexical chunk use, a finding that he argued deserves the attention of teachers. Dr. Hong introduced the Singapore Corpus of Research in Education (SCoRE), which he developed, along with its multilayer annotation and retrieval platform. The corpus integrates transcribed texts, classroom videos, and analytical data from linguistics and education researchers, so that education researchers can readily retrieve the interactional behavior of learners or teachers in classroom teaching. This provides factual evidence for diagnosing teaching problems and for curriculum reform.

On the afternoon of December 27, Professor Maocheng Liang of the National Research Centre for Foreign Language Education delivered a keynote speech titled “Beyond Keywords.” Addressing computational shortcomings in existing keyword tools, Professor Liang proposed an improved method for calculating keyness. He argued that when extracting key clusters and key structures (such as *he V-ed that*), the total number of occurrences of clusters or structures, rather than the total word count, should serve as the total token count in the keyness formula. To this end, Professor Liang developed a dedicated tool called Keywords Plus, which incorporates the improved algorithm and closely integrates keywords with concordance lines.
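The effect of changing the denominator can be illustrated with a small sketch. The report does not say which statistic Keywords Plus uses, so the code below assumes the two-corpus log-likelihood statistic common in keyword tools. The function names, the `use_cluster_total` switch, and the toy data are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-corpus log-likelihood (G2) for one item (assumed statistic)."""
    total = size_study + size_ref
    expected_study = size_study * (freq_study + freq_ref) / total
    expected_ref = size_ref * (freq_study + freq_ref) / total
    g2 = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def cluster_keyness(study_tokens, ref_tokens, cluster, use_cluster_total=True):
    """Keyness of an n-word cluster in a study corpus against a reference corpus.

    With use_cluster_total=True, corpus size is the number of n-gram tokens
    (the kind of denominator the talk argues for); otherwise it is the
    plain word count.
    """
    n = len(cluster)
    study_counts = Counter(ngrams(study_tokens, n))
    ref_counts = Counter(ngrams(ref_tokens, n))
    if use_cluster_total:
        size_study = sum(study_counts.values())
        size_ref = sum(ref_counts.values())
    else:
        size_study, size_ref = len(study_tokens), len(ref_tokens)
    return log_likelihood(study_counts[cluster], size_study,
                          ref_counts[cluster], size_ref)


# Toy data only; each list stands in for a tokenized corpus.
study = "i think it is good i think it is bad".split()
reference = "we find it is good".split()
by_clusters = cluster_keyness(study, reference, ("i", "think", "it"), True)
by_words = cluster_keyness(study, reference, ("i", "think", "it"), False)
```

In this toy case the two scores differ because the number of three-word sequences is not proportional to the number of words in each corpus. For simplicity the sketch counts n-grams over one flat token list; a real tool would likely stop clusters at text or sentence boundaries, which changes the cluster totals further.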

Over the two days, more than 60 delegates presented their research findings and engaged in thorough and lively discussions with experts and other participants. Participants generally reported that both the keynote speeches and the group discussions were highly inspiring and rewarding.

The conference successfully concluded at around 4:30 p.m. on December 27.