Article information

  • Title: Analysis of string representations for a modern programming language.
  • Author: Naugler, David
  • Journal: Transactions of the Missouri Academy of Science
  • Print ISSN: 0544-540X
  • Publication year: 2005
  • Issue: January
  • Language: English
  • Publisher: Missouri Academy of Science
  • Abstract: If you are designing a language that will provide only one primitive string type that subsumes character and supports Unicode, what is the best internal representation? Should it be mutable or immutable? Which encoding should it use: UTF-8, UTF-16, UTF-32, some hybrid, or multiple encodings? Should the length be encoded as part of the string, and if so, how? Should the string support list-like head/tail recursive algorithms? Should strings be interned (stored in a global hash table) to save space and provide constant-time equality checks? If so, how should the hashing work? In general, should strings be viewed as suitable data structures for most common text operations, or are they opaque containers that must be converted to some other type (list, vector, deque, etc.) for processing? No perfect solution exists, but I analyze the alternatives and justify the string representation I use for my programming language Rune.
  • Keywords: Programming languages; Software engineering
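
Two of the trade-offs raised in the abstract, encoding width and interning, can be illustrated with a small Python sketch. It is not taken from the article and does not describe how Rune represents strings; the names `encoded_sizes` and `InternTable` are hypothetical and exist only for this illustration.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the byte length of `ch` under three Unicode encodings.

    The "-le" codec variants are used so that no byte-order mark is counted.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


class InternTable:
    """Hypothetical intern table: one canonical object per distinct string value."""

    def __init__(self):
        self._table = {}

    def intern(self, text: str) -> str:
        # Hash the contents once; return the stored object if an equal string
        # already exists, otherwise store this one as the canonical copy.
        return self._table.setdefault(text, text)


# The storage cost of a single code point depends on the encoding chosen.
ascii_sizes = encoded_sizes("a")           # U+0061
accented_sizes = encoded_sizes("\u00e9")   # U+00E9, e with acute accent
clef_sizes = encoded_sizes("\U0001d11e")   # U+1D11E, outside the Basic Multilingual Plane

# Two equal strings built separately at run time become one shared object
# after interning, so equality reduces to an identity (pointer) comparison.
table = InternTable()
first = table.intern("".join(["ru", "ne"]))
second = table.intern("".join(["r", "une"]))
same_object = first is second
```

The interning step trades a hash-table lookup at creation time for cheap comparisons later, which is the cost the abstract's question about hashing is concerned with.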

* Shade, E. Computer Science Department, Southwest Missouri State University.
