When should you use struct and not class in C#? My conceptual model is that structs are used when the item is merely a collection of value types: a way to logically hold them all together as a cohesive whole.
I came across these rules here [1]:
Do these rules work? What does a struct mean semantically?
The source referenced by the OP has some credibility ...but what about Microsoft - what is the stance on struct usage? I sought some extra learning from Microsoft [1], and here is what I found:
Consider defining a structure instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.
Do not define a structure unless the type has all of the following characteristics:
- It logically represents a single value, similar to primitive types (integer, double, and so on).
- It has an instance size smaller than 16 bytes.
- It is immutable.
- It will not have to be boxed frequently.
Okay, #2 and #3 anyway. Our beloved dictionary has 2 internal structs:
[StructLayout(LayoutKind.Sequential)] // default for structs
private struct Entry //<Tkey, TValue>
{
// View code at *Reference Source
}
[Serializable, StructLayout(LayoutKind.Sequential)]
public struct Enumerator :
IEnumerator<KeyValuePair<TKey, TValue>>, IDisposable,
IDictionaryEnumerator, IEnumerator
{
// View code at *Reference Source
}
* Reference Source [2]
The 'JonnyCantCode.com' source got 3 out of 4 - quite forgivable since #4 probably wouldn't be an issue. If you find yourself boxing a struct, rethink your architecture.
Let's look at why Microsoft would use these structs:
- Entry and Enumerator represent single values.
- Entry is never passed as a parameter outside of the Dictionary class. Further investigation shows that in order to satisfy implementation of IEnumerable, Dictionary uses the Enumerator struct which it copies every time an enumerator is requested ...makes sense.
- Enumerator is public because Dictionary is enumerable and must have equal accessibility to the IEnumerator interface implementation - e.g. IEnumerator getter.
Update - In addition, realize that when a struct implements an interface - as Enumerator does - and is cast to that interface type, the struct is boxed: a copy is placed on the heap and is handled through a reference. Internal to the Dictionary class, Enumerator is still a value type. However, when GetEnumerator() is called through the IEnumerable interface, a boxed, reference-type IEnumerator is returned.
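A minimal sketch of what casting a struct to an interface does, assuming a hypothetical ICounter interface and Counter struct (neither is part of Dictionary). The cast copies the struct into a heap object, so later changes made through the interface do not reach the original variable:

```csharp
using System;

// Hypothetical interface and struct, used only for illustration.
public interface ICounter
{
    int Count { get; }
    void Increment();
}

public struct Counter : ICounter
{
    private int _count;
    public int Count => _count;
    public void Increment() => _count++;
}

public static class BoxingDemo
{
    // Returns the count seen through the struct variable and through the boxed copy.
    public static (int Original, int Boxed) Run()
    {
        Counter counter = new Counter();
        counter.Increment();        // mutates the variable in place: Count is 1

        ICounter boxed = counter;   // boxing: the struct is copied into a heap object
        boxed.Increment();          // mutates the boxed copy only: its Count is 2

        return (counter.Count, boxed.Count);
    }
}
```

BoxingDemo.Run() returns (1, 2): the struct variable and the boxed copy have diverged.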
What we don't see here is any attempt or proof of requirement to keep structs immutable or maintaining an instance size of only 16 bytes or less:
- readonly - not immutable
- Entry has an undetermined lifetime (from Add(), to Remove(), Clear(), or garbage collection);
And ... 4. Both structs store TKey and TValue, which we all know are quite capable of being reference types (added bonus info)
Hashed keys notwithstanding, dictionaries are fast in part because instancing a struct is quicker than a reference type. Here, I have a Dictionary<int, int> that stores 300,000 random integers with sequentially incremented keys.
Capacity: 312874
MemSize: 2660827 bytes
Completed Resize: 5ms
Total time to fill: 889ms
Capacity: number of elements available before the internal array must be resized.
MemSize: determined by serializing the dictionary into a MemoryStream and getting a byte length (accurate enough for our purposes).
Completed Resize: the time it takes to resize the internal array from 150862 elements to 312874 elements. When you figure that each element is sequentially copied via Array.CopyTo(), that ain't too shabby.
Total time to fill: admittedly skewed due to logging and an OnResize event I added to the source; however, still impressive to fill 300k integers while resizing 15 times during the operation. Just out of curiosity, what would the total time to fill be if I already knew the capacity? 13ms
So, now, what if Entry were a class? Would these times or metrics really differ that much?
Capacity: 312874
MemSize: 2660827 bytes
Completed Resize: 26ms
Total time to fill: 964ms
Obviously, the big difference is in resizing. Any difference if Dictionary is initialized with the Capacity? Not enough to be concerned with ... 12ms.
What happens is, because Entry is a struct, it does not require initialization like a reference type. This is both the beauty and the bane of the value type. In order to use Entry as a reference type, I had to insert the following code:
/*
* Added to satisfy initialization of entry elements --
* this is where the extra time is spent resizing the Entry array
* **/
for (int i = 0 ; i < prime ; i++)
{
destinationArray[i] = new Entry( );
}
/* *********************************************** */
The reason I had to initialize each array element of Entry as a reference type can be found at MSDN: Structure Design [3]. In short:
Do not provide a default constructor for a structure.
If a structure defines a default constructor, when arrays of the structure are created, the common language runtime automatically executes the default constructor on each array element.
Some compilers, such as the C# compiler, do not allow structures to have default constructors.
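The initialization difference can be sketched as follows, assuming two hypothetical stand-in types, EntryStruct and EntryClass (not the real Dictionary internals). A new array of structs is already filled with usable zeroed values, while a new array of class references holds only nulls until each element is constructed:

```csharp
using System;

// Hypothetical stand-ins for the struct and class versions of Entry.
public struct EntryStruct { public int Key; public int Value; }
public class EntryClass { public int Key; public int Value; }

public static class ArrayInitDemo
{
    public static EntryStruct[] MakeStructArray(int size)
    {
        // Every element is a zero-initialized EntryStruct; no per-element work is needed.
        return new EntryStruct[size];
    }

    public static EntryClass[] MakeClassArray(int size)
    {
        // Every element starts as null, so each one must be allocated explicitly.
        var entries = new EntryClass[size];
        for (int i = 0; i < entries.Length; i++)
        {
            entries[i] = new EntryClass();
        }
        return entries;
    }
}
```

The loop in MakeClassArray corresponds to the extra per-element allocation that the class-based resize had to perform.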
It is actually quite simple and we will borrow from Asimov's Three Laws of Robotics [4]:
...what do we take away from this: in short, be responsible with the use of value types. They are quick and efficient, but have the ability to cause many unexpected behaviors if not properly maintained (i.e. unintentional copies).
[1] http://msdn.microsoft.com/en-us/library/ms229017.aspx
... Decimal or DateTime], then if it would not abide by the other three rules, it should be replaced by a class. If a structure holds a fixed collection of variables, each of which may hold any value that would be valid for its type [e.g. Rectangle], then it should abide by different rules, some of which are contrary to those for "single-value" structs. - supercat
... Dictionary entry type on the basis that it's an internal type only, performance was considered more important than semantics, or some other excuse. My point is that a type like Rectangle should have its contents exposed as individually-editable fields not "because" the performance benefits outweigh the resulting semantic imperfections, but because the type semantically represents a fixed set of independent values, and so the mutable struct is both more performant and semantically superior. - supercat
... Value type, then following the rule of thumb would infer 4 or fewer members. If the struct is cast to an interface (becoming a Reference type), then I would completely ignore the rule of thumb personally. This could very well be the reason that the Dictionary's enumerator has reference type members and is certainly larger than 16 bytes. - IAbstract
Whenever you:
The caveat, however, is that structs (arbitrarily large) are more expensive to pass around than class references (usually one machine word), so classes could end up being faster in practice.
... (Guid)null (it's okay to cast a null to a reference-type), among other things. - user166390
I do not agree with the rules given in the original post. Here are my rules:
You use structs for performance when stored in arrays. (see also When are structs the answer? [1])
You need them in code passing structured data to/from C/C++
Do not use structs unless you need them:
... struct to know how it will behave, but if something is a struct with exposed fields, that's all one has to know. If an object exposes a property of an exposed-field-struct type, and if code reads that struct to a variable and modifies, one can safely predict that such action will not affect the object whose property was read unless or until the struct is written back. By contrast, if the property were a mutable class type, reading it and modifying it might update the underlying object as expected, but... - supercat
... Int32 are implemented as structs instead of classes? - Honinbo Shusaku
Use a struct when you want value semantics as opposed to reference semantics.
If you need reference semantics you need a class not a struct.
... record makes that convenient. Also, it is possible to create a struct that is mutable, violating what is usually required to say it has value semantics. - Marco Eckstein
In addition to the "it is a value" answer, one specific scenario for using structs is when you know that you have a set of data that is causing garbage collection issues, and you have lots of objects. For example, a large list/array of Person instances. The natural metaphor here is a class, but if you have a large number of long-lived Person instances, they can end up clogging GEN-2 and causing GC stalls. If the scenario warrants it, one potential approach here is to use an array (not list) of Person structs, i.e. Person[]. Now, instead of having millions of objects in GEN-2, you have a single chunk on the LOH (I'm assuming no strings etc here - i.e. a pure value without any references). This has very little GC impact.
Working with this data is awkward, as the data is probably over-sized for a struct, and you don't want to copy fat values all the time. However, accessing it directly in an array does not copy the struct - it is in-place (contrast to a list indexer, which does copy). This means lots of work with indexes:
int index = ...
int id = peopleArray[index].Id;
Note that keeping the values themselves immutable will help here. For more complex logic, use a method with a by-ref parameter:
void Foo(ref Person person) {...}
...
Foo(ref peopleArray[index]);
Again, this is in-place - we have not copied the value.
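A small sketch of the in-place behaviour described above, assuming a hypothetical mutable Person struct with Id and Age fields (mutable only to make the effect visible). Passing an array element by ref updates the slot itself, while the list indexer hands back a copy:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct for illustration; the real Person type is not shown in the answer.
public struct Person
{
    public int Id;
    public int Age;
}

public static class InPlaceDemo
{
    public static void Birthday(ref Person person)
    {
        person.Age++;   // operates directly on the caller's storage location
    }

    public static (int ArrayAge, int ListAge) Run()
    {
        var peopleArray = new Person[1];
        peopleArray[0].Age = 30;        // writes directly into the array slot
        Birthday(ref peopleArray[0]);   // no copy: the slot now holds Age 31

        var peopleList = new List<Person> { new Person { Id = 1, Age = 30 } };
        Person copy = peopleList[0];    // the list indexer returns a copy of the struct
        Birthday(ref copy);             // only the local copy changes
        // peopleList[0].Age++;         // would not compile (CS1612) for the same reason

        return (peopleArray[0].Age, peopleList[0].Age);
    }
}
```

InPlaceDemo.Run() returns (31, 30): the array element was modified in place, the list element was not.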
In very specific scenarios, this tactic can be very successful; however, it is a fairly advanced scenario that should be attempted only if you know what you are doing and why. The default here would be a class.
... ICustomer, and had a CustomerRef struct which implemented that interface, held a single int index, and acted in appropriate fashion upon the items of the array? I would think that if one made the methods which take a Customer instead accept an ICustomer generically it should be possible to get performance comparable to the current approach without having to widely expose the underlying array. - supercat
... customers[index] to customers[index >> 16][index & 65535] without having to affect anything outside CustomerRef. - supercat
... ICustomer approach would allow one to migrate away from using a monolithic array should the need arise (the 2GB limit was one reason why it might do so, but not the only one). Incidentally, one thing your blog doesn't mention as a "cost" of your approach is that the GC has no way of knowing which array slots have references to them. That reduces the cost of GC, but means it may be necessary for an application to track such things itself. - supercat
List, I believe, uses an Array behind the scenes, no? - Royi Namir
From the C# Language specification [1]:
1.7 Structs
Like classes, structs are data structures that can contain data members and function members, but unlike classes, structs are value types and do not require heap allocation. A variable of a struct type directly stores the data of the struct, whereas a variable of a class type stores a reference to a dynamically allocated object. Struct types do not support user-specified inheritance, and all struct types implicitly inherit from type object.
Structs are particularly useful for small data structures that have value semantics. Complex numbers, points in a coordinate system, or key-value pairs in a dictionary are all good examples of structs. The use of structs rather than classes for small data structures can make a large difference in the number of memory allocations an application performs. For example, the following program creates and initializes an array of 100 points. With Point implemented as a class, 101 separate objects are instantiated—one for the array and one each for the 100 elements.
class Point
{
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
class Test
{
static void Main() {
Point[] points = new Point[100];
for (int i = 0; i < 100; i++) points[i] = new Point(i, i);
}
}
An alternative is to make Point a struct.
struct Point
{
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
Now, only one object is instantiated—the one for the array—and the Point instances are stored in-line in the array.
Struct constructors are invoked with the new operator, but that does not imply that memory is being allocated. Instead of dynamically allocating an object and returning a reference to it, a struct constructor simply returns the struct value itself (typically in a temporary location on the stack), and this value is then copied as necessary.
With classes, it is possible for two variables to reference the same object and thus possible for operations on one variable to affect the object referenced by the other variable. With structs, the variables each have their own copy of the data, and it is not possible for operations on one to affect the other. For example, the output produced by the following code fragment depends on whether Point is a class or a struct.
Point a = new Point(10, 10);
Point b = a;
a.x = 20;
Console.WriteLine(b.x);
If Point is a class, the output is 20 because a and b reference the same object. If Point is a struct, the output is 10 because the assignment of a to b creates a copy of the value, and this copy is unaffected by the subsequent assignment to a.x.
The previous example highlights two of the limitations of structs. First, copying an entire struct is typically less efficient than copying an object reference, so assignment and value parameter passing can be more expensive with structs than with reference types. Second, except for ref and out parameters, it is not possible to create references to structs, which rules out their usage in a number of situations.
[1] http://msdn.microsoft.com/en-us/library/ms228593.aspx
... ref to a mutable struct and know that any mutations that outside method will perform on it will be done before it returns. It's too bad .net doesn't have any concept of ephemeral parameters and function return values, since... - supercat
... ref to be achieved with class objects. Essentially, local variables, parameters, and function return values could be persistable (default), returnable, or ephemeral. Code would be forbidden from copying ephemeral things to anything that would outlive the present scope. Returnable things would be like ephemeral things except that they could be returned from a function. The return value of a function would be bound by the tightest restrictions applicable to any of its "returnable" parameters. - supercat
Here is a basic rule.
If all member fields are value types create a struct.
If any one member field is a reference type, create a class. This is because the reference type field will need the heap allocation anyway.
Examples
public struct MyPoint
{
public int X; // Value Type
public int Y; // Value Type
}
public class MyPointWithName
{
public int X; // Value Type
public int Y; // Value Type
public string Name; // Reference Type
}
... string are semantically equivalent to values, and storing a reference to an immutable object into a field does not entail a heap allocation. The difference between a struct with exposed public fields and a class object with exposed public fields is that given the code sequence var q=p; p.X=4; q.X=5;, p.X will have the value 4 if p is a structure type, and 5 if it's a class type. If one wishes to be able to conveniently modify the members of the type, one should select 'class' or 'struct' based upon whether one wants changes to q to affect p. - supercat
ArraySegment<T> encapsulates a T[], which is always a class type. Structure type KeyValuePair<TKey,TValue> is often used with class types as the generic parameters. - supercat
BitVector32 is broken per your example as it uses a 32-byte array which is a reference type. This is not a hard and fast way to determine whether you need a struct. - IAbstract
... struct actually improve performance but that should be verified by profiling and shipped with unit tests showing you have not altered the functionality of the program. If you want to include part of this comment in the answer as edit feel free to do that ^^ - CoffeDeveloper
Structs are good for atomic representation of data, where the said data can be copied multiple times by the code. Cloning an object is in general more expensive than copying a struct, as it involves allocating the memory, running the constructor and deallocating/garbage collection when done with it.
First: Interop scenarios or when you need to specify the memory layout
Second: When the data is almost the same size as a reference pointer anyway.
I made a small benchmark with BenchmarkDotNet [1] to get a better understanding of "struct" benefit in numbers. I'm testing looping through an array (or list) of structs (or classes). Creating those arrays or lists is out of the benchmark's scope - it is clear that "class" is heavier, will utilize more memory, and will involve GC.
So the conclusion is: be careful with LINQ and hidden struct boxing/unboxing, and when using structs for micro-optimizations, strictly stay with arrays.
P.S. Another benchmark about passing struct/class through call stack is there https://stackoverflow.com/a/47864451/506147
BenchmarkDotNet=v0.10.8, OS=Windows 10 Redstone 2 (10.0.15063)
Processor=Intel Core i5-2500K CPU 3.30GHz (Sandy Bridge), ProcessorCount=4
Frequency=3233542 Hz, Resolution=309.2584 ns, Timer=TSC
[Host] : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2101.1
Clr : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2101.1
Core : .NET Core 4.6.25211.01, 64bit RyuJIT
Method | Job | Runtime | Mean | Error | StdDev | Min | Max | Median | Rank | Gen 0 | Allocated |
---------------- |----- |-------- |----------:|----------:|----------:|----------:|----------:|----------:|-----:|-------:|----------:|
TestListClass | Clr | Clr | 5.599 us | 0.0408 us | 0.0382 us | 5.561 us | 5.689 us | 5.583 us | 3 | - | 0 B |
TestArrayClass | Clr | Clr | 2.024 us | 0.0102 us | 0.0096 us | 2.011 us | 2.043 us | 2.022 us | 2 | - | 0 B |
TestListStruct | Clr | Clr | 8.427 us | 0.1983 us | 0.2204 us | 8.101 us | 9.007 us | 8.374 us | 5 | - | 0 B |
TestArrayStruct | Clr | Clr | 1.539 us | 0.0295 us | 0.0276 us | 1.502 us | 1.577 us | 1.537 us | 1 | - | 0 B |
TestLinqClass | Clr | Clr | 13.117 us | 0.1007 us | 0.0892 us | 13.007 us | 13.301 us | 13.089 us | 7 | 0.0153 | 80 B |
TestLinqStruct | Clr | Clr | 28.676 us | 0.1837 us | 0.1534 us | 28.441 us | 28.957 us | 28.660 us | 9 | - | 96 B |
TestListClass | Core | Core | 5.747 us | 0.1147 us | 0.1275 us | 5.567 us | 5.945 us | 5.756 us | 4 | - | 0 B |
TestArrayClass | Core | Core | 2.023 us | 0.0299 us | 0.0279 us | 1.990 us | 2.069 us | 2.013 us | 2 | - | 0 B |
TestListStruct | Core | Core | 8.753 us | 0.1659 us | 0.1910 us | 8.498 us | 9.110 us | 8.670 us | 6 | - | 0 B |
TestArrayStruct | Core | Core | 1.552 us | 0.0307 us | 0.0377 us | 1.496 us | 1.618 us | 1.552 us | 1 | - | 0 B |
TestLinqClass | Core | Core | 14.286 us | 0.2430 us | 0.2273 us | 13.956 us | 14.678 us | 14.313 us | 8 | 0.0153 | 72 B |
TestLinqStruct | Core | Core | 30.121 us | 0.5941 us | 0.5835 us | 28.928 us | 30.909 us | 30.153 us | 10 | - | 88 B |
Code:
[RankColumn, MinColumn, MaxColumn, StdDevColumn, MedianColumn]
[ClrJob, CoreJob]
[HtmlExporter, MarkdownExporter]
[MemoryDiagnoser]
public class BenchmarkRef
{
public class C1
{
public string Text1;
public string Text2;
public string Text3;
}
public struct S1
{
public string Text1;
public string Text2;
public string Text3;
}
List<C1> testListClass = new List<C1>();
List<S1> testListStruct = new List<S1>();
C1[] testArrayClass;
S1[] testArrayStruct;
public BenchmarkRef()
{
for(int i=0;i<1000;i++)
{
testListClass.Add(new C1 { Text1= i.ToString(), Text2=null, Text3= i.ToString() });
testListStruct.Add(new S1 { Text1 = i.ToString(), Text2 = null, Text3 = i.ToString() });
}
testArrayClass = testListClass.ToArray();
testArrayStruct = testListStruct.ToArray();
}
[Benchmark]
public int TestListClass()
{
var x = 0;
foreach(var i in testListClass)
{
x += i.Text1.Length + i.Text3.Length;
}
return x;
}
[Benchmark]
public int TestArrayClass()
{
var x = 0;
foreach (var i in testArrayClass)
{
x += i.Text1.Length + i.Text3.Length;
}
return x;
}
[Benchmark]
public int TestListStruct()
{
var x = 0;
foreach (var i in testListStruct)
{
x += i.Text1.Length + i.Text3.Length;
}
return x;
}
[Benchmark]
public int TestArrayStruct()
{
var x = 0;
foreach (var i in testArrayStruct)
{
x += i.Text1.Length + i.Text3.Length;
}
return x;
}
[Benchmark]
public int TestLinqClass()
{
var x = testListClass.Select(i=> i.Text1.Length + i.Text3.Length).Sum();
return x;
}
[Benchmark]
public int TestLinqStruct()
{
var x = testListStruct.Select(i => i.Text1.Length + i.Text3.Length).Sum();
return x;
}
}
[1] https://github.com/dotnet/BenchmarkDotNet
You need to use a "struct" in situations where you want to explicitly specify memory layout using the StructLayoutAttribute [1] - typically for PInvoke.
Edit: Comment points out that you can use class or struct with StructLayoutAttribute and that is certainly true. In practice, you would typically use a struct - as a local variable or argument it typically lives on the stack rather than the heap, which makes sense if you are just passing an argument to an unmanaged method call.
[1] http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute.aspx
MYTH #1: STRUCTS ARE LIGHTWEIGHT CLASSES
This myth comes in a variety of forms. Some people believe that value types can’t or shouldn’t have methods or other significant behavior—they should be used as simple data transfer types, with just public fields or simple properties. The DateTime type is a good counterexample to this: it makes sense for it to be a value type, in terms of being a fundamental unit like a number or a character, and it also makes sense for it to be able to perform calculations based on its value. Looking at things from the other direction, data transfer types should often be reference types anyway—the decision should be based on the desired value or reference type semantics, not the simplicity of the type. Other people believe that value types are “lighter” than reference types in terms of performance. The truth is that in some cases value types are more performant—they don’t require garbage collection unless they’re boxed, don’t have the type identification overhead, and don’t require dereferencing, for example. But in other ways, reference types are more performant—parameter passing, assigning values to variables, returning values, and similar operations only require 4 or 8 bytes to be copied (depending on whether you’re running the 32-bit or 64-bit CLR) rather than copying all the data. Imagine if ArrayList were somehow a “pure” value type, and passing an ArrayList expression to a method involved copying all its data! In almost all cases, performance isn’t really determined by this sort of decision anyway. Bottlenecks are almost never where you think they’ll be, and before you make a design decision based on performance, you should measure the different options. It’s worth noting that the combination of the two beliefs doesn’t work either. It doesn’t matter how many methods a type has (whether it’s a class or a struct)—the memory taken per instance isn’t affected.
(There’s a cost in terms of the memory taken up for the code itself, but that’s incurred once rather than for each instance.)
MYTH #2: REFERENCE TYPES LIVE ON THE HEAP; VALUE TYPES LIVE ON THE STACK
This one is often caused by laziness on the part of the person repeating it. The first part is correct—an instance of a reference type is always created on the heap. It’s the second part that causes problems. As I’ve already noted, a variable’s value lives wherever it’s declared, so if you have a class with an instance variable of type int, that variable’s value for any given object will always be where the rest of the data for the object is—on the heap. Only local variables (variables declared within methods) and method parameters live on the stack. In C# 2 and later, even some local variables don’t really live on the stack, as you’ll see when we look at anonymous methods in chapter 5.
ARE THESE CONCEPTS RELEVANT NOW? It’s arguable that if you’re writing managed code, you should let the runtime worry about how memory is best used. Indeed, the language specification makes no guarantees about what lives where; a future runtime may be able to create some objects on the stack if it knows it can get away with it, or the C# compiler could generate code that hardly uses the stack at all.
The next myth is usually just a terminology issue.
MYTH #3: OBJECTS ARE PASSED BY REFERENCE IN C# BY DEFAULT
This is probably the most widely propagated myth. Again, the people who make this claim often (though not always) know how C# actually behaves, but they don’t know what “pass by reference” really means. Unfortunately, this is confusing for people who do know what it means. The formal definition of pass by reference is relatively complicated, involving l-values and similar computer-science terminology, but the important thing is that if you pass a variable by reference, the method you’re calling can change the value of the caller’s variable by changing its parameter value. Now, remember that the value of a reference type variable is the reference, not the object itself. You can change the contents of the object that a parameter refers to without the parameter itself being passed by reference. For instance, the following method changes the contents of the StringBuilder object in question, but the caller’s expression will still refer to the same object as before:
void AppendHello(StringBuilder builder)
{
builder.Append("hello");
}
When this method is called, the parameter value (a reference to a StringBuilder) is passed by value. If you were to change the value of the builder variable within the method—for example, with the statement builder = null;—that change wouldn’t be seen by the caller, contrary to the myth. It’s interesting to note that not only is the “by reference” bit of the myth inaccurate, but so is the “objects are passed” bit. Objects themselves are never passed, either by reference or by value. When a reference type is involved, either the variable is passed by reference or the value of the argument (the reference) is passed by value. Aside from anything else, this answers the question of what happens when null is used as a by-value argument—if objects were being passed around, that would cause issues, as there wouldn’t be an object to pass! Instead, the null reference is passed by value in the same way as any other reference would be. If this quick explanation has left you bewildered, you might want to look at my article, “Parameter passing in C#,” (http://mng.bz/otVt), which goes into much more detail. These myths aren’t the only ones around. Boxing and unboxing come in for their fair share of misunderstanding, which I’ll try to clear up next.
Reference: C# in Depth 3rd Edition by Jon Skeet
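The point of Myth #3 can be checked with a short sketch. ResetByValue, ResetByRef and the Run helper are hypothetical names added for illustration; only AppendHello comes from the quoted text. Changing an object's contents is visible to the caller, reassigning a by-value parameter is not, and reassigning a ref parameter is:

```csharp
using System;
using System.Text;

public static class ParameterPassingDemo
{
    // From the quoted text: mutates the object the reference points to.
    public static void AppendHello(StringBuilder builder)
    {
        builder.Append("hello");
    }

    // Hypothetical helper: reassigns only the method's own copy of the reference.
    public static void ResetByValue(StringBuilder builder)
    {
        builder = null;
    }

    // Hypothetical helper: the variable itself is passed by reference.
    public static void ResetByRef(ref StringBuilder builder)
    {
        builder = null;
    }

    public static (string Text, bool StillSetAfterByValue, bool NullAfterByRef) Run()
    {
        var sb = new StringBuilder();
        AppendHello(sb);              // the caller sees the appended text
        ResetByValue(sb);             // the caller's variable is unaffected
        bool stillSet = sb != null;
        string text = sb.ToString();
        ResetByRef(ref sb);           // the caller's variable is now null
        return (text, stillSet, sb == null);
    }
}
```

ParameterPassingDemo.Run() returns ("hello", true, true).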
I use structs for packing or unpacking any sort of binary communication format. That includes reading or writing to disk, DirectX vertex lists, network protocols, or dealing with encrypted/compressed data.
The three guidelines you list haven't been useful for me in this context. When I need to write out four hundred bytes of stuff in a Particular Order, I'm gonna define a four-hundred-byte struct, and I'm gonna fill it with whatever unrelated values it's supposed to have, and I'm going to set it up whatever way makes the most sense at the time. (Okay, four hundred bytes would be pretty strange-- but back when I was writing Excel files for a living, I was dealing with structs of up to about forty bytes all over, because that's how big some of the BIFF records ARE.)
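As a rough sketch of this kind of packing, the following uses a made-up 8-byte RecordHeader (not an actual BIFF record) with an explicit sequential layout and no padding, and copies it to and from a byte array through the Marshal class. The type, field names and helper methods are assumptions for illustration:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record layout: fields are written in declaration order with no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RecordHeader
{
    public ushort RecordType;
    public ushort Length;
    public uint Offset;
}

public static class BinaryPacking
{
    public static byte[] ToBytes(RecordHeader header)
    {
        int size = Marshal.SizeOf<RecordHeader>();   // 8 bytes for this layout
        byte[] buffer = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(header, ptr, false);  // copy the struct into unmanaged memory
            Marshal.Copy(ptr, buffer, 0, size);          // then into the managed byte array
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return buffer;
    }

    public static RecordHeader FromBytes(byte[] buffer)
    {
        int size = Marshal.SizeOf<RecordHeader>();
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(buffer, 0, ptr, size);
            return Marshal.PtrToStructure<RecordHeader>(ptr);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

The byte order of the output follows the machine's endianness, so a real file or wire format may need explicit byte swapping on top of this.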
With the exception of the valuetypes that are used directly by the runtime and various others for PInvoke purposes, you should only use valuetypes in 2 scenarios.
... this parameter used to invoke its methods); classes allow one to duplicate references. - supercat
.NET supports value types and reference types (in Java, you can define only reference types). Instances of reference types get allocated in the managed heap and are garbage collected when there are no outstanding references to them. Instances of value types, on the other hand, are stored inline wherever they are declared: typically on the stack for local variables, where the memory is reclaimed as soon as their scope ends, or inside the containing object or array on the heap. And of course, by default value types get passed by copying the whole value, whereas for reference types only the reference is copied. All C# primitive data types, except for System.String, are value types.
When to use struct over class,
In C#, structs are value types, classes are reference types. You can create value types, in C#, using the enum keyword and the struct keyword. Using a value type instead of a reference type will result in fewer objects on the managed heap, which results in lesser load on the garbage collector (GC), less frequent GC cycles, and consequently better performance. However, value types have their downsides too. Passing around a big struct is definitely costlier than passing a reference, that's one obvious problem. The other problem is the overhead associated with boxing/unboxing. In case you're wondering what boxing/unboxing mean, follow these links for a good explanation on boxing and unboxing. Apart from performance, there are times when you simply need types to have value semantics, which would be very difficult (or ugly) to implement if reference types are all you have. You should use value types only when you need copy semantics or need automatic initialization, normally in arrays of these types.
... ref. Passing any size structure by ref costs the same as passing a class reference by value. Copying any size structure or passing by value is cheaper than performing a defensive copy of a class object and storing or passing a reference to that. The big times classes are better than structs for storing values are (1) when the classes are immutable (so as to avoid defensive copying), and each instance which is created will be passed around a lot, or... - supercat
... readOnlyStruct.someMember = 5; is not to make someMember a read-only property, but instead make it a field. - supercat
A struct is a value type. If you assign a struct to a new variable, the new variable will contain a copy of the original.
public struct IntStruct {
public int Value {get; set;}
}
Execution of the following results in 5 instances of the struct stored in memory:
var struct1 = new IntStruct() { Value = 0 }; // original
var struct2 = struct1; // A copy is made
var struct3 = struct2; // A copy is made
var struct4 = struct3; // A copy is made
var struct5 = struct4; // A copy is made
// NOTE: A "copy" will occur when you pass a struct into a method parameter.
// To avoid the "copy", use the ref keyword.
// Although structs are designed to use fewer system resources
// than classes, if used incorrectly they could use significantly more.
A class is a reference type. When you assign a class to a new variable, the variable contains a reference to the original class object.
public class IntClass {
public int Value {get; set;}
}
Execution of the following results in only one instance of the class object in memory.
var class1 = new IntClass() { Value = 0 };
var class2 = class1; // A reference is made to class1
var class3 = class2; // A reference is made to class1
var class4 = class3; // A reference is made to class1
var class5 = class4; // A reference is made to class1
Structs may increase the likelihood of a code mistake. If a value object is treated like a mutable reference object, a developer may be surprised when changes made are unexpectedly lost.
var struct1 = new IntStruct() { Value = 0 };
var struct2 = struct1;
struct2.Value = 1;
// At this point, a developer may be surprised when
// struct1.Value is 0 and not 1
Structure types in C# or other .net languages are generally used to hold things that should behave like fixed-sized groups of values. A useful aspect of structure types is that the fields of a structure-type instance can be modified by modifying the storage location in which it is held, and in no other way. It's possible to code a structure in such a way that the only way to mutate any field is to construct a whole new instance and then use a struct assignment to mutate all the fields of the target by overwriting them with values from the new instance, but unless a struct provides no means of creating an instance where its fields have non-default values, all of its fields will be mutable if and only if the struct itself is stored in a mutable location.
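A brief sketch of that last point, assuming a hypothetical ImmutablePoint struct whose fields are all readonly. The fields cannot be written individually, yet the value held in a mutable variable can still change, because assigning a whole new struct overwrites every field of the storage location:

```csharp
using System;

// Hypothetical "immutable" struct: its fields can only be set in the constructor.
public struct ImmutablePoint
{
    public readonly int X;
    public readonly int Y;

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class StorageLocationDemo
{
    public static (int X, int Y) Run()
    {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        // p.X = 10;                      // does not compile: X is readonly
        p = new ImmutablePoint(10, p.Y);  // whole-struct assignment overwrites all fields of p
        return (p.X, p.Y);
    }
}
```

StorageLocationDemo.Run() returns (10, 2). Had p been declared as a readonly field instead of a local variable, the assignment would be rejected outside a constructor, which is the sense in which mutability belongs to the storage location.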
Note that it's possible to design a structure type so that it will essentially behave like a class type, if the structure contains a private class-type field, and redirects its own members to that of the wrapped class object. For example, a PersonCollection might offer properties SortedByName and SortedById, both of which hold an "immutable" reference to a PersonCollection (set in their constructor) and implement GetEnumerator by calling either creator.GetNameSortedEnumerator or creator.GetIdSortedEnumerator. Such structs would behave much like a reference to a PersonCollection, except that their GetEnumerator methods would be bound to different methods in the PersonCollection. One could also have a structure wrap a portion of an array (e.g. one could define an ArrayRange<T> structure which would hold a T[] called Arr, an int Offset, and an int Length, with an indexed property which, for an index idx in the range 0 to Length-1, would access Arr[idx+Offset]). Unfortunately, if foo is a read-only instance of such a structure, current compiler versions won't allow operations like foo[3]+=4; because they have no way to determine whether such operations would attempt to write to fields of foo.
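As a rough sketch of the ArrayRange<T> idea described above: the field names Arr, Offset and Length come from the text, while the constructor, the bounds checks and the demo class are assumptions added for illustration.

```csharp
using System;

// A struct that wraps a portion of an array. Copying the struct copies
// only the reference and two ints; the underlying array is shared.
public struct ArrayRange<T>
{
    public readonly T[] Arr;
    public readonly int Offset;
    public readonly int Length;

    public ArrayRange(T[] arr, int offset, int length)
    {
        if (arr == null) throw new ArgumentNullException(nameof(arr));
        if (offset < 0 || length < 0 || offset > arr.Length - length)
            throw new ArgumentOutOfRangeException();
        Arr = arr;
        Offset = offset;
        Length = length;
    }

    // Index idx in 0..Length-1 maps to Arr[idx + Offset].
    public T this[int idx]
    {
        get { Check(idx); return Arr[idx + Offset]; }
        set { Check(idx); Arr[idx + Offset] = value; }
    }

    private void Check(int idx)
    {
        if (idx < 0 || idx >= Length) throw new IndexOutOfRangeException();
    }
}

public static class ArrayRangeDemo
{
    public static void Main()
    {
        var data = new[] { 10, 20, 30, 40, 50 };
        var range = new ArrayRange<int>(data, 1, 3); // covers 20, 30, 40

        range[0] += 5;              // writes through to data[1]
        Console.WriteLine(data[1]); // 25

        var copy = range;           // copies the struct, not the array
        copy[2] = 99;               // writes through to data[3]
        Console.WriteLine(data[3]); // 99
    }
}
```

Here range is an ordinary local variable, so the compound assignment compiles; the restriction mentioned above concerns read-only instances.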
It's also possible to design a structure to behave like a value type which holds a variable-sized collection (which will appear to be copied whenever the struct is) but the only way to make that work is to ensure that no object to which the struct holds a reference will ever be exposed to anything which might mutate it. For example, one could have an array-like struct which holds a private array, and whose indexed "put" method creates a new array whose content is like that of the original except for one changed element. Unfortunately, it can be somewhat difficult to make such structs perform efficiently. While there are times that struct semantics can be convenient (e.g. being able to pass an array-like collection to a routine, with the caller and callee both knowing that outside code won't modify the collection, may be better than requiring both caller and callee to defensively copy any data they're given), the requirement that class references point to objects that will never be mutated is often a pretty severe constraint.
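The array-like struct with a copying "put" might look roughly like this. ValueArray<T>, Get and Put are hypothetical names; the sketch only illustrates the idea and makes no attempt to be efficient, which is the difficulty the paragraph points out.

```csharp
using System;

// An array-like struct with value semantics. The private array is never
// exposed and never mutated once stored, so copies cannot affect each other.
public struct ValueArray<T>
{
    private T[] _items;

    public ValueArray(T[] source)
    {
        // Defensive copy: the caller keeps no reference to our array.
        _items = (T[])source.Clone();
    }

    // default(ValueArray<T>) has a null field; treat it as empty.
    private T[] Items
    {
        get { return _items ?? Array.Empty<T>(); }
    }

    public int Length
    {
        get { return Items.Length; }
    }

    public T Get(int index)
    {
        return Items[index];
    }

    // "Put" builds a new array that differs in one element and swaps the
    // private reference; arrays seen by other copies stay untouched.
    public void Put(int index, T value)
    {
        var copy = (T[])Items.Clone();
        copy[index] = value;
        _items = copy;
    }
}

public static class ValueArrayDemo
{
    public static void Main()
    {
        var a = new ValueArray<int>(new[] { 1, 2, 3 });
        var b = a;       // copies only the reference to the private array
        b.Put(0, 42);    // b now points at a fresh array

        Console.WriteLine(a.Get(0)); // 1
        Console.WriteLine(b.Get(0)); // 42
    }
}
```

Every Put allocates and copies a whole array, so a loop of writes costs O(n) per element.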
Nah - I don't entirely agree with the rules. They are good guidelines to consider with performance and standardization, but not in light of the possibilities.
As you can see in the responses, there are a lot of creative ways to use them. So, these guidelines need to just be that, always for the sake of performance and efficiency.
In this case, I use classes to represent real world objects in their larger form, I use structs to represent smaller objects that have more exact uses. The way you said it, "a more cohesive whole." The keyword being cohesive. The classes will be more object oriented elements, while structs can have some of those characteristics, though on a smaller scale. IMO.
I use them a lot in Treeview and Listview tags where common static attributes can be accessed very quickly. I have always struggled to get this info another way. For example, in my database applications, I use a Treeview where I have Tables, SPs, Functions, or any other objects. I create and populate my struct, put it in the tag, pull it out, get the data of the selection and so forth. I wouldn't do this with a class!
I do try and keep them small, use them in single instance situations, and keep them from changing. It's prudent to be aware of memory, allocation, and performance. And testing is so necessary.
double values for those coordinates, such a spec would compel it to behave semantically identically to an exposed-field struct except for some details of multi-threaded behavior (the immutable class would be better in some cases, while the exposed-field struct would be better in others; a so-called "immutable" struct would be worse in every case). - supercat
My rule is
1, Always use class;
2, If there is any performance issue, I try to change some class to struct depending on the rules which @IAbstract mentioned, and then do a test to see if these changes can improve performance.
Foo to encapsulate a fixed collection of independent values (e.g. coordinates of a point) which one will sometimes want to pass around as a group and sometimes want to change independently. I've not found any pattern for using classes which combines both purposes nearly as nicely as a simple exposed-field struct (which, being a fixed collection of independent variables, fits the bill perfectly). - supercat
public readonly fields in my types, too, because creating read-only properties is simply too much work for practically no benefit.) - stakx - no longer contributing
MyListOfPoint[3].Offset(2,3); into var temp=MyListOfPoint[3]; temp.Offset(2,3);, a transform which is bogus when applied... - supercat
Offset method. The proper way to prevent such bogus code shouldn't be to make structs needlessly immutable, but instead to allow methods like Offset to be tagged with an attribute forbidding the aforementioned transform. Implicit numerical conversions too could have been much better if they could be tagged so as to be applicable only in cases where their invocation would be obvious. If overloads exist for foo(float,float) and foo(double,double), I would posit that trying to use a float and a double often shouldn't apply an implicit conversion, but should instead be an error. - supercat
double value to a float, or passing it to a method which can take a float argument but not double, would almost always do what the programmer intended. By contrast, assigning a float expression to double without an explicit typecast is often a mistake. The only time allowing implicit double->float conversion would cause problems would be when it would cause a less-than-ideal overload to be selected. I'd posit that the right way to prevent that shouldn't have been forbidding implicit double->float, but tagging overloads with attributes to disallow conversion. - supercat
A class is a reference type. When an object of the class is created, the variable to which the object is assigned holds only a reference to that memory. When the object reference is assigned to a new variable, the new variable refers to the original object. Changes made through one variable are reflected in the other variable because they both refer to the same data. A struct is a value type. When a struct is created, the variable to which the struct is assigned holds the struct's actual data. When the struct is assigned to a new variable, it is copied. The new variable and the original variable therefore contain two separate copies of the same data. Changes made to one copy do not affect the other copy. In general, classes are used to model more complex behavior, or data that is intended to be modified after a class object is created. Structs are best suited for small data structures that contain primarily data that is not intended to be modified after the struct is created.
Classes and Structs (C# Programming Guide) [1]
[1] http://msdn.microsoft.com/en-us/library/ms173109.aspx
I was just dealing with Windows Communication Foundation [WCF] Named Pipe and I did notice that it does make sense to use Structs in order to ensure that exchange of data is of value type instead of reference type.
Briefly, use struct if:
your object properties/fields do not need to be changed. I mean you just want to give them an initial value and then read them.
properties and fields in your object are value type and they are not so large.
If that's the case, you can take advantage of structs for better performance and optimized memory allocation, because struct values can live on the stack or inline inside their containing object rather than requiring a separate heap allocation (as class instances do).
The C# struct is a lightweight alternative to a class. It can do almost the same as a class, but it's less "expensive" to use a struct rather than a class. The reason for this is a bit technical, but to sum up, new instances of a class are placed on the heap, whereas newly instantiated structs held in local variables are typically placed on the stack. Furthermore, you are not dealing with references to structs, like with classes, but instead you are working directly with the struct instance. This also means that when you pass a struct to a function, it is by value, instead of as a reference. There is more about this in the chapter about function parameters.
So, you should use structs when you wish to represent simpler data structures, and especially if you know that you will be instantiating lots of them. There are lots of examples in the .NET framework where Microsoft has used structs instead of classes, for instance the Point, Rectangle and Color structs.
The following rules are defined on the Microsoft website:
✔️ CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.
❌ AVOID defining a struct unless the type has all of the following characteristics:
It logically represents a single value, similar to primitive types (int, double, etc.).
It has an instance size under 16 bytes.
It is immutable.
It will not have to be boxed frequently.
for further reading [1]
[1] https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/choosing-between-class-and-struct?redirectedfrom=MSDN
Let me add another aspect besides the commonly cited performance difference, and that is the intention-revealing usage of default values.
Do not use a struct if the default values of its fields do not represent a sensible default value of the modeled concept.
Eg.
If you implement a concept with a class then you can enforce certain invariants, eg. that a person must have a first name and a last name. But with a struct it is always possible to create an instance with all of its fields set to their default values.
So when modeling a concept that has no sensible default value prefer a class. The users of your class will understand that null means that a PersonName is not specified but they will be confused if you hand them a PersonName struct instance with all of its properties set to null.
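A small sketch of the difference, using hypothetical PersonNameClass and PersonNameStruct types (the names and the null checks are assumptions for illustration): the class constructor can enforce the invariant, while default(...) on the struct bypasses the constructor entirely.

```csharp
using System;

// Class version: the constructor is the only way in, so the invariant holds.
public sealed class PersonNameClass
{
    public string FirstName { get; }
    public string LastName { get; }

    public PersonNameClass(string firstName, string lastName)
    {
        FirstName = firstName ?? throw new ArgumentNullException(nameof(firstName));
        LastName = lastName ?? throw new ArgumentNullException(nameof(lastName));
    }
}

// Struct version: same checks, but default(PersonNameStruct) and array
// elements are zero-initialized without ever running the constructor.
public struct PersonNameStruct
{
    public string FirstName { get; }
    public string LastName { get; }

    public PersonNameStruct(string firstName, string lastName)
    {
        FirstName = firstName ?? throw new ArgumentNullException(nameof(firstName));
        LastName = lastName ?? throw new ArgumentNullException(nameof(lastName));
    }
}

public static class DefaultValueDemo
{
    public static void Main()
    {
        PersonNameClass missing = null;              // "no name" is explicit
        Console.WriteLine(missing == null);          // True

        var blank = default(PersonNameStruct);       // constructor not called
        Console.WriteLine(blank.FirstName == null);  // True

        var names = new PersonNameStruct[2];         // elements are defaults too
        Console.WriteLine(names[0].LastName == null); // True
    }
}
```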
(Usual disclaimer: performance considerations may override this advice. If you have performance concerns always measure before deciding on a solution. Try BenchmarkDotNet [1], it's awesome!)
[1] https://benchmarkdotnet.org/articles/overview.html
I rarely use a struct for things. But that's just me. It depends whether I need the object to be nullable or not.
As stated in other answers, I use classes for real-world objects. I also have the mindset that structs are for storing small amounts of data.
Structs can be used to improve garbage collection performance. While you usually don't have to worry about GC performance, there are scenarios where it can be a killer, such as large caches in low-latency applications. See this post for an example:
http://00sharp.wordpress.com/2013/07/03/a-case-for-the-struct/
I think a good first approximation is "never".
I think a good second approximation is "never".
If you are desperate for perf, consider them, but then always measure.
✔️ CONSIDER Struct Usage
Classes are best suited for grouping together complex actions and data that will change throughout a program; structs are a better choice for simple objects and data that will remain constant for the most part. Besides their uses, they are fundamentally different in one key area: how they are passed or assigned between variables. Classes are reference types, meaning that a variable holds a reference to the object and assignment copies only that reference; structs are value types, meaning that assignment copies the data itself.
When a struct object is created, all of its data is stored in its corresponding variable with no references or connections to its memory location. This makes structs useful for creating objects that need to be copied quickly and efficiently, while still retaining their separate identities.
ExampleStruct struct1 = new ExampleStruct();
ExampleStruct struct2 = struct1;
Modifying struct2 won't affect struct1.
Basically, structs are created to increase performance. But sometimes structs may be slower because of all the copying involved. If your struct has lots of variables that need to be copied, converting it to a class and just passing references around may be faster.
If you have an array of structs, the array itself is an object on the heap and struct values are contained in the array. So the garbage collector only has one object to consider. If the array goes out of scope, the garbage collector can deallocate all the structs inside the array in one step. If any other part of your code is using structs from this array, since structs are copied we can safely deallocate the array itself and its contents.
If you have an array of objects, the array itself and each object in the array are separate objects on the heap. Each object could be stored in a totally different part of the heap and another part of your code might have references to those objects. So when our array goes out of scope, we cannot deallocate everything right away, because the garbage collector has to consider each object individually and make sure there are no references to each object before deallocating them.
Structures are in most ways like classes/objects. A structure can contain functions and members and can implement interfaces, although it cannot inherit from or be inherited by another type. But in C# structures are mostly used just for holding data. Structures generally take less RAM than classes and are easier for the garbage collector to collect. If you want something with a lot of functions, use a class/object.
System.Drawing.Rectangle violates all three of these rules. - ChrisW
Rectangle is that its members are properties rather than fields. Use of struct properties when fields would have sufficed is probably 95% responsible for the "mutable structs are evil" notion (there are legitimate uses for property setters on read-only struct instances; because of that, C# only recently started forbidding their use (shutting out the legitimate use cases); structs which wrapped members in read-write properties effectively turned code that would and should have generated compiler errors into code which would compile but not work. - supercat
System.Drawing.Rectangle represent a single value? Could you please explain this? - Marson Mao
Rectangle: something that most definitely should be a struct, yet violates all of these. Thus they are not good conditions. - Wolfzoon