- It is immutable. Once a string is created, it cannot be modified. "Updating" a string's value actually creates a new string object with the updated content, and the old string is reclaimed by the GC (Garbage Collector) once nothing references it.
- It is a reference type. Because of the immutability, many people think that String is a value type. It is actually a reference type, so a string can be null. Being a reference type is a good thing, because we can save memory by sharing the same object reference for long strings that have the same content. A null string is not equivalent to an empty string!
- It overloads the == operator. When the == operator is used, the Equals() method is called. It first checks whether the compared strings are the same object. If so, Equals() skips checking the content and returns true. If the two strings refer to different objects, a content-based comparison starts. So in the first scenario, Equals() is much faster. You can read more about comparing strings here: How to: Optimize the strings’ comparison
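The three properties above can be checked with a minimal sketch. The class and method names (StringBasics, SameObject) are made up for illustration and are not part of the .NET API.

```csharp
using System;

public static class StringBasics
{
    // Hypothetical helper: true when both variables refer to the same object.
    public static bool SameObject(string a, string b)
    {
        return Object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        // Immutability: ToUpper() returns a new string, the original is untouched.
        string original = "abc";
        string upper = original.ToUpper();
        Console.WriteLine(original);                      // abc
        Console.WriteLine(upper);                         // ABC

        // Reference type: a null string is not an empty string.
        string nothing = null;
        string empty = "";
        Console.WriteLine(nothing == empty);              // False
        Console.WriteLine(String.IsNullOrEmpty(nothing)); // True

        // Overloaded ==: content is compared even when the objects differ.
        string built = new string(new[] { 'a', 'b', 'c' });
        Console.WriteLine(built == original);             // True
        Console.WriteLine(SameObject(built, original));   // False
    }
}
```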
Preserve memory
The System.String type is used in every .NET application. We have strings as names, addresses, descriptions, error messages, warnings and even application settings. Every application has to create, compare or format string data. Given the immutability, and the fact that any object can be converted to a string, a large share of the available memory can be swallowed by unwanted string duplicates or unreclaimed string objects. Now let's see how a string object should be handled to preserve memory.
String literals
Using literals guarantees that strings with the same content refer to the same string object.
C# .NET
string literal1 = "STRING";
string literal2 = "STRING";
Console.WriteLine("literal1 = {0}", literal1);
Console.WriteLine("literal2 = {0}", literal2);
if (Object.ReferenceEquals(literal1, literal2))
{
// Both variables refer to the same object.
}
This is where an often overlooked technique called string interning comes into play. The CLR maintains an intern pool, which is in essence a collection of unique strings. When your code is compiled, the string literals you reference are stored in the assembly, and the runtime adds them to this pool when the code is loaded. Since many literals in a program tend to appear in multiple places, this conserves memory.
Concatenated literals use the intern pool too, because the compiler folds them into a single literal:
C# .NET
// Are sharing the same object!
string string1 = "My" + " " + "STRING";
string string2 = "My STRING";
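The claims about literals can be verified with a short sketch. The names LiteralInterning and LiteralsShareObject are hypothetical. String.IsInterned is a real framework method that returns the pooled reference, or null when the value is not in the pool; its result is printed here rather than assumed.

```csharp
using System;

public static class LiteralInterning
{
    // Hypothetical helper: compares a literal with a concatenation of literals.
    public static bool LiteralsShareObject()
    {
        string whole = "My STRING";
        string folded = "My" + " " + "STRING"; // folded into one literal at compile time
        return Object.ReferenceEquals(whole, folded);
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareObject()); // True

        // Reports whether the value is currently in the intern pool.
        Console.WriteLine(String.IsInterned("My STRING") != null);
    }
}
```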
String constants
String constants give you the same effect, because the compiler replaces all constant references with the defined string literals.
String.Empty vs ""
Use String.Empty rather than "". This is more about speed than memory usage, but it is a useful tip. The "" is a literal, so it acts like one: it is created on first use, and the same reference is returned for the following uses. Only one instance of "" is stored in memory no matter how many times we use it, so I don't see any memory penalty here. The difference is that a "" literal has to be looked up in the intern pool, while String.Empty is a static read-only field that already holds a reference to an empty string stored in the .NET Framework memory zone. The runtime normally resolves a literal once rather than on every execution, so the gain is small in practice. String.Empty points to the same memory address for VB.NET and C# applications. So why search for a reference when you already have it in String.Empty?
C# .NET
// Whether these share the same object depends on the runtime version; do not rely on it.
string empty1 = "";
string empty2 = String.Empty;
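Whether "" and String.Empty are the same object is a runtime detail, so the sketch below prints the result instead of assuming one. The content comparison, on the other hand, always holds. The names EmptyStringCheck and IsEmptyContent are made up for this example.

```csharp
using System;

public static class EmptyStringCheck
{
    // Hypothetical helper: true for a non-null string with no characters.
    public static bool IsEmptyContent(string value)
    {
        return value != null && value.Length == 0;
    }

    public static void Main()
    {
        string empty1 = "";
        string empty2 = String.Empty;

        // Content equality holds regardless of how the objects are stored.
        Console.WriteLine(empty1 == empty2); // True

        // Reference identity may differ between runtime versions.
        Console.WriteLine(Object.ReferenceEquals(empty1, empty2));
    }
}
```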
String = String
If a string is initialized from an existing string, both variables share the same object.
C# .NET
// Are sharing the same object!
string string1 = "STRING";
string string2 = string1;
Assigning a new value to either of them leaves the two variables pointing to two different string objects:
C# .NET
string1 = "UPDATED STRING";
Now string1 points to the newly created string ("UPDATED STRING"), while string2 still points to the old string object ("STRING").
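Put together as a runnable sketch, the two steps look like this. The class and method names are hypothetical.

```csharp
using System;

public static class SharedReference
{
    // Hypothetical helper: assignment copies the reference, not the content.
    public static bool SharedAfterAssign()
    {
        string string1 = "STRING";
        string string2 = string1;
        return Object.ReferenceEquals(string1, string2);
    }

    // Hypothetical helper: reassigning string1 does not affect string2.
    public static string SecondAfterUpdate()
    {
        string string1 = "STRING";
        string string2 = string1;
        string1 = "UPDATED STRING";
        return string2;
    }

    public static void Main()
    {
        Console.WriteLine(SharedAfterAssign());  // True
        Console.WriteLine(SecondAfterUpdate());  // STRING
    }
}
```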
The String.Concat() method
The String.Concat() method creates a new string object on each call. So strings created by this method never share the same object, even if they have the same content:
C# .NET
// Two different objects are created!
string concat1 = String.Concat("My", " ", "String");
string concat2 = String.Concat("My", " ", "String");
The StringBuilder class
This is also true when using the StringBuilder class:
C# .NET
StringBuilder stringBuilder1 = new StringBuilder();
stringBuilder1.Append("String");
StringBuilder stringBuilder2 = new StringBuilder();
stringBuilder2.Append("String");
// Two different objects are created!
string sb1 = stringBuilder1.ToString();
string sb2 = stringBuilder2.ToString();
Strings created at run-time
Strings created at run-time don't share the same object:
C# .NET
// Two different objects are created!
string runTime1 = Char.ConvertFromUtf32(200);
string runTime2 = Char.ConvertFromUtf32(200);
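The last three sections can be checked together. This is a minimal sketch with a hypothetical helper (RuntimeStrings.Build) that builds a string through StringBuilder; the content is equal, but each call yields its own object.

```csharp
using System;
using System.Text;

public static class RuntimeStrings
{
    // Hypothetical helper: assembles a string at run-time.
    public static string Build(string first, string second)
    {
        StringBuilder builder = new StringBuilder();
        builder.Append(first);
        builder.Append(second);
        return builder.ToString();
    }

    public static void Main()
    {
        string runTime1 = Build("My ", "String");
        string runTime2 = Build("My ", "String");

        Console.WriteLine(runTime1 == runTime2);                      // True
        Console.WriteLine(Object.ReferenceEquals(runTime1, runTime2)); // False
    }
}
```

Each duplicate occupies its own memory, which is exactly what interning is meant to avoid.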
The String.Intern() method
Strings created at run-time can behave like literals if the String.Intern() method is used. The Intern method searches the intern pool for a string equal to the value of its argument. If such a string exists, its reference in the intern pool is returned. If the string does not exist, a reference to the argument is added to the intern pool, and then that reference is returned. Note that searching for a string in the intern pool can be expensive, depending on how many strings are in the pool at that time.
C# .NET
// Are sharing the same object!
string interned1 = String.Intern(Char.ConvertFromUtf32(200));
string interned2 = String.Intern(Char.ConvertFromUtf32(200));
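A slightly longer sketch shows the before and after. The helper InternDemo.Build is hypothetical; String.Intern is the real framework method. Note that the second run-time string still exists as a separate temporary object after interning.

```csharp
using System;
using System.Text;

public static class InternDemo
{
    // Hypothetical helper: assembles a string at run-time.
    public static string Build(string first, string second)
    {
        return new StringBuilder().Append(first).Append(second).ToString();
    }

    public static void Main()
    {
        string raw1 = Build("run", "time");
        string raw2 = Build("run", "time");
        Console.WriteLine(Object.ReferenceEquals(raw1, raw2));       // False

        // Both calls return the single pooled reference.
        string pooled1 = String.Intern(raw1);
        string pooled2 = String.Intern(raw2);
        Console.WriteLine(Object.ReferenceEquals(pooled1, pooled2)); // True

        // raw2 had to be allocated before it could be interned.
        Console.WriteLine(Object.ReferenceEquals(raw2, pooled2));    // False
    }
}
```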
Keep in mind that interning a string has two unwanted side effects:
- The memory allocated for interned String objects is not likely to be released until the Common Language Runtime (CLR) terminates. The reason is that the CLR's reference to the interned String object can persist after your application, or even your application domain, terminates.
- To intern a string, you must first create the string. The memory used by the String object must still be allocated, even though the memory will eventually be garbage collected.
8 comments:
Dude, lay off the Ctrl-B. Makes it very hard to read.
Great article! Thanks for the insight. (and I like the bold anonymous)
This is a fascinating article. Where did you acquire your knowledge of the intricacies of the .NET framework, such as string interning? Could you recommend any further reading on this topic, or similar topics related to the nuts and bolts of the .NET framework?
In what situations would String.Intern() be advisable? Perhaps if you have a really, really large string that you anticipate will be used in the intern pool?
Keep up the awesome work -- I'll look forward to more posts of this caliber.
I don't mind the bold =)
Thanks for sharing this
nice article
regards
Thanks for the helpful post.
@DotNetYuppie : Read CLR via C# by Jeffrey Richter (hardcover or ebook) and you'll find all these and much more ;)
This very good article..very informative.
very well explained...