Comparison of programming languages (strings)
This comparison of programming languages (strings) compares the features of string data structures and text-string processing in over 52 computer programming languages.
Concatenation
Different languages use different symbols for the concatenation operator. Many languages use the "+" symbol, though several deviate from this.
Common variants
Operator | Languages |
---|---|
+ | ALGOL 68, BASIC, C++, C#, Cobra, Pascal, Object Pascal, Eiffel, Go, JavaScript, Java, Python, Turing, Ruby, Rust, Windows PowerShell, Objective-C, Swift, F#, Scala, Ya |
++ | Haskell, Erlang |
$+ | mIRC Scripting Language |
& | Ada, AppleScript, COBOL (for literals only), Curl, Seed7, VHDL, Visual Basic, Visual Basic .NET, Excel, FreeBASIC |
nconc | Common Lisp |
. | Perl, PHP, Maple (up to version 5), AutoHotkey |
~ | Raku and D |
|| | Icon, Standard SQL, PL/I, Rexx, and Maple (from version 6) |
<> | Mathematica, Wolfram Language |
.. | Lua |
: | Pick Basic |
, | J programming language, Smalltalk, APL |
^ | OCaml, Standard ML, F#, rc |
// | Fortran |
* | Julia |
Unique variants
- Awk uses the empty string: two expressions adjacent to each other are concatenated. This is called juxtaposition. Unix shells have a similar syntax. Rexx uses this syntax for concatenation including an intervening space.
- C (along with Python) allows juxtaposition for string literals; however, for strings stored as character arrays, the strcat function must be used.
- COBOL uses the STRING statement to concatenate string variables.
- MATLAB and Octave use the syntax "[x y]" to concatenate x and y.
- Visual Basic and Visual Basic .NET can also use the "+" sign, but at the risk of ambiguity when a string representing a number and a number are combined.
- Microsoft Excel allows both "&" and the function "=CONCATENATE(X,Y)".
- Rust has the concat! macro and the format! macro, of which the latter is the most prevalent throughout the documentation and examples.
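As a minimal sketch of two of the variants above, the following Python snippet compares the "+" operator with juxtaposition of string literals. The variable names are chosen for illustration only, and the str.join call is included as a common alternative that the tables above do not list.

```python
# Concatenation with the "+" operator (Python appears in the "+" row above).
greeting = "Hello, " + "world!"

# Juxtaposition: adjacent string literals are joined into one string.
# This works for literals only, not for variables.
juxtaposed = "Hello, " "world!"

# Joining several pieces at once; str.join is a method, not an operator.
joined = "".join(["Hello", ", ", "world", "!"])

print(greeting == juxtaposed == joined)
```

All three expressions build the same string; juxtaposition is resolved when the source is compiled, whereas "+" and join are evaluated at run time.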
String literals
This section compares styles for declaring a string literal.
Quoted interpolated
An expression is "interpolated" into a string when the compiler/interpreter evaluates it and inserts the result in its place.
Syntax | Language(s) |
---|---|
$"hello, {name}" | C#, Visual Basic .NET |
"Hello, $name!" | Bourne shell, Perl, PHP, Windows PowerShell |
qq(Hello, $name!) | Perl (alternate) |
"Hello, {$name}!" | PHP (alternate) |
"Hello, #{name}!" | CoffeeScript, Ruby |
%Q(Hello, #{name}!) | Ruby (alternate) |
(format t "Hello, ~A" name) | Common Lisp |
`Hello, ${name}!` | JavaScript (ECMAScript 6) |
"Hello, \(name)!" | Swift |
f'Hello, {name}!' | Python |
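The Python row can be illustrated with a short sketch. The value assigned to name is a hypothetical placeholder; the snippet only shows that the expression inside the braces is evaluated and its result inserted in place.

```python
name = "Ada"  # hypothetical value, for illustration only

# The expression inside {} is evaluated and its result is inserted in place.
interpolated = f"Hello, {name}!"

# Any expression is allowed inside the braces, not just a variable name.
shouted = f"Hello, {name.upper()}!"

# Without the f prefix the braces are ordinary characters.
plain = "Hello, {name}!"

print(interpolated)
print(shouted)
print(plain)
```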
Escaped quotes
"Escaped" quotes means that a flag symbol (the escape character) is placed before a quote to signal that the quote is part of the string rather than the end of it.
Syntax | Language(s) |
---|---|
"I said \"Hello, world!\"" | C, C++, C#, D, F#, Java, JavaScript, Mathematica, OCaml, Perl, PHP, Python, Rust, Swift, Wolfram Language, Ya |
'I said \'Hello, world!\'' | CoffeeScript, JavaScript (alternate), Python (alternate) |
"I said `"Hello, world!`"" | Windows PowerShell |
"I said ^"Hello, world!^"" | REBOL |
{I said "Hello, world!"} | REBOL (alternate) |
"I said, %"Hello, World!%"" | Eiffel |
!"I said \"Hello, world!\"" | FreeBASIC |
r#"I said "Hello, world!""# | Rust (alternate) |
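A brief Python sketch of the backslash rows: each escaped quote becomes a single quote character in the resulting string, and the backslash itself is not stored. The variable names are illustrative.

```python
# Backslash-escaped double quotes inside a double-quoted literal.
escaped_double = "I said \"Hello, world!\""

# Backslash-escaped single quotes inside a single-quoted literal.
escaped_single = 'I said \'Hello, world!\''

# Choosing the other delimiter avoids the escapes entirely.
unescaped = 'I said "Hello, world!"'

print(escaped_double == unescaped)
print(escaped_single)
```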
Dual quoting
"Dual quoting" means that a quote character inside a string is written twice; the compiler discards one of the pair, leaving a single quote in the string.
Syntax | Language(s) |
---|---|
"I said ""Hello, world!""" | Ada, ALGOL 68, Excel, Fortran, Visual Basic (.NET), FreeBASIC, COBOL |
'I said ''Hello, world!''' | Fortran, rc, COBOL, SQL, Pascal, Object Pascal, APL, Smalltalk |
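Python does not use dual quoting for its own literals, so the following is only a sketch of the rule itself. The helpers dual_quote and dual_unquote are hypothetical names introduced here; they are not part of any library.

```python
def dual_quote(text: str, quote: str = '"') -> str:
    """Build a dual-quoted literal: every embedded quote is doubled."""
    return quote + text.replace(quote, quote * 2) + quote


def dual_unquote(literal: str, quote: str = '"') -> str:
    """Strip the outer quotes and collapse each doubled quote to one.

    Assumes the literal is well formed; stray single quotes inside the
    literal are not detected.
    """
    if len(literal) < 2 or literal[0] != quote or literal[-1] != quote:
        raise ValueError("not a dual-quoted literal")
    return literal[1:-1].replace(quote * 2, quote)


print(dual_quote('I said "Hello, world!"'))
print(dual_unquote("'I said ''Hello, world!'''", quote="'"))
```

The first call reproduces the double-quote row of the table, and the second recovers the plain text from the single-quote row.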
Quoted raw
"Raw" means the compiler treats every character within the literal exactly as written, without processing any escapes or interpolations.
Multiline string
Many languages have a syntax specifically intended for strings with multiple lines. In some of these languages, this syntax is a here document or "heredoc": A token representing the string is put in the middle of a line of code, but the code continues after the starting token and the string's content doesn't appear until the next line. In other languages, the string's content starts immediately after the starting token and the code continues after the string literal's terminator.
Syntax | Here document | Language(s) |
---|---|---|
<<EOF I have a lot of things to say and so little time to say them EOF | Yes | Bourne shell, Perl, PHP, Ruby |
<<<EOF I have a lot of things to say and so little time to say them EOF | Yes | PHP |
@" I have a lot of things to say and so little time to say them "@ | No | Windows PowerShell |
"[ I have a lot of things to say and so little time to say them ]" | No | Eiffel |
""" I have a lot of things to say and so little time to say them """ | No | CoffeeScript, Python, Groovy, Swift, Kotlin |
" I have a lot of things to say and so little time to say them " | No | Visual Basic .NET (all strings are multiline), Rust (all strings are multiline) |
r" I have a lot of things to say and so little time to say them " | No | Rust |
[[ I have a lot of things to say and so little time to say them ]] | No | Lua |
` I have a lot of things to say and so little time to say them ` | No | JavaScript (ECMAScript 6) |
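The triple-quote row can be sketched in Python as follows. The position of the line break is an assumption, since the table shows the sample text on one line; textwrap.dedent is a standard-library helper used here to show one way of keeping the literal indented with the surrounding code.

```python
import textwrap

# The content starts right after the opening token and runs to the terminator.
message = """I have a lot of things to say
and so little time to say them"""

# The backslash after the opening quotes suppresses the initial newline;
# dedent then removes the common leading indentation.
indented = textwrap.dedent("""\
    I have a lot of things to say
    and so little time to say them""")

print(message == indented)
print(message.splitlines())
```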
Unique quoting variants
Syntax | Variant name | Language(s) |
---|---|---|
13HHello, world! | Hollerith notation | Fortran 66 |
(indented with whitespace) | Indented with whitespace and newlines | YAML |
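The Hollerith row follows a simple pattern: a character count, the letter H, then exactly that many characters. The Python sketch below encodes and decodes that pattern; to_hollerith and from_hollerith are hypothetical helper names, and the sketch counts Python characters without modelling any Fortran character set.

```python
def to_hollerith(text: str) -> str:
    """Prefix the text with its length and the letter H."""
    return f"{len(text)}H{text}"


def from_hollerith(constant: str) -> str:
    """Read the count before the first H and return that many characters."""
    count, separator, rest = constant.partition("H")
    if not separator or not (count.isascii() and count.isdigit()):
        raise ValueError("missing count or H separator")
    length = int(count)
    if len(rest) < length:
        raise ValueError("constant is shorter than its declared length")
    return rest[:length]


print(to_hollerith("Hello, world!"))
print(from_hollerith("13HHello, world!"))
```

Because the length is stated up front, the text needs no closing delimiter and no escaping.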
Notes
1. String.raw`` still processes string interpolation.