English original: “Learn How to Avoid The 10 Most Common C# Mistakes”
About C#
C# is one of several languages that target the Microsoft Common Language Runtime (CLR). Languages that target the CLR benefit from features such as cross-language integration and exception handling, enhanced security, a simplified model for component interaction, and debugging and profiling services. Of today’s CLR languages, C# is the most widely used for complex, professional development projects that target the Windows desktop, mobile, or server environments.
C# is an object oriented, strongly-typed language. The strict type checking in C#, both at compile and run times, results in the majority of typical programming errors being reported as early as possible, and their locations pinpointed quite accurately. This can save the C# programmer a lot of time, compared to tracking down the cause of puzzling errors which can occur long after the offending operation takes place in languages which are more liberal with their enforcement of type safety. However, a lot of programmers unwittingly (or carelessly) throw away the benefits of this detection, which leads to some of the issues discussed in this C# tutorial.
About this Tutorial
This tutorial describes 10 of the most common programming mistakes made by C# programmers, or problems they should avoid, and offers guidance on each.
While most of the mistakes discussed in this article are C# specific, some are also relevant to other languages that target the CLR or make use of the Framework Class Library (FCL).
Common Mistake #1: Using a reference like a value or vice versa
Programmers of C++, and many other languages, are accustomed to being in control of whether the values they assign to variables are simply values or are references to existing objects. In C#, however, that decision is made by the programmer who wrote the object, not by the programmer who instantiates the object and assigns it to a variable. This is a common “gotcha” for newbie C# programmers.
If you don’t know whether the object you’re using is a value type or reference type, you could run into some surprises. For example:
Point point1 = new Point(20, 30);
Point point2 = point1;
point2.X = 50;
Console.WriteLine(point1.X); // 20 (does this surprise you?)
Console.WriteLine(point2.X); // 50
Pen pen1 = new Pen(Color.Black);
Pen pen2 = pen1;
pen2.Color = Color.Blue;
Console.WriteLine(pen1.Color); // Blue (or does this surprise you?)
Console.WriteLine(pen2.Color); // Blue
As you can see, both the Point and Pen objects were created the exact same way, but the value of point1 remained unchanged when a new X coordinate value was assigned to point2, whereas the value of pen1 was modified when a new color was assigned to pen2. We can therefore deduce that point1 and point2 each contain their own copy of a Point object, whereas pen1 and pen2 contain references to the same Pen object. But how can we know that without doing this experiment?
The answer is to look at the definitions of the object types (which you can easily do in Visual Studio by placing your cursor over the name of the object type and pressing F12):
public struct Point { … } // defines a “value” type
public class Pen { … } // defines a “reference” type
As shown above, in C#, the struct keyword is used to define a value type, while the class keyword is used to define a reference type. For those with a C++ background, who were lulled into a false sense of security by the many similarities between C++ and C# keywords, this behavior likely comes as a surprise that may have you asking for help from a C# tutorial.
If you’re going to depend on some behavior which differs between value and reference types – such as the ability to pass an object as a method parameter and have that method change the state of the object – make sure that you’re dealing with the correct type of object to avoid C# problems.
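The parameter-passing difference mentioned above can be sketched as follows. PointValue and PointRef are hypothetical types invented for this illustration (they are not the System.Drawing types used earlier), so treat this as a minimal sketch rather than a definitive pattern.

```csharp
using System;

// Hypothetical types for illustration only.
public struct PointValue { public int X; }   // value type
public class PointRef { public int X; }      // reference type

public static class ValueVsReference
{
    // Receives a copy of the struct, so the caller's variable is unaffected.
    public static void Move(PointValue p) { p.X = 50; }

    // Receives a copy of the reference, so the caller's object is modified.
    public static void Move(PointRef p) { p.X = 50; }

    // 'ref' passes the caller's variable itself, so even a struct can be changed.
    public static void MoveByRef(ref PointValue p) { p.X = 50; }

    public static int StructAfterMove()
    {
        var p = new PointValue { X = 20 };
        Move(p);
        return p.X; // still 20
    }

    public static int ClassAfterMove()
    {
        var p = new PointRef { X = 20 };
        Move(p);
        return p.X; // now 50
    }

    public static int StructAfterMoveByRef()
    {
        var p = new PointValue { X = 20 };
        MoveByRef(ref p);
        return p.X; // now 50
    }
}
```

If a method is meant to modify a value type held by the caller, the ref keyword makes that intent explicit at both the declaration and the call site.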
Common Mistake #2: Misunderstanding default values for uninitialized variables
In C#, value types can’t be null. By definition, value types have a value, and even uninitialized variables of value types must have a value. This is called the default value for that type. This leads to the following, usually unexpected result when checking if a variable is uninitialized:
class Program {
    static Point point1;
    static Pen pen1;
    static void Main(string[] args) {
        Console.WriteLine(pen1 == null);   // True
        Console.WriteLine(point1 == null); // False (huh?)
    }
}
Why isn’t point1 null? The answer is that Point is a value type, and the default value for a Point is (0,0), not null. Failure to recognize this is a very easy (and common) mistake to make in C#.
Many (but not all) value types have an IsEmpty property which you can check to see if it is equal to its default value:
Console.WriteLine(point1.IsEmpty); // True
When you’re checking to see if a variable has been initialized or not, make sure you know what value an uninitialized variable of that type will have by default, and don’t rely on it being null.
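For value types without an IsEmpty property, two common options are comparing against default(T) or declaring the variable as a nullable value type. The sketch below assumes a hypothetical Coord struct; the helper names are illustrative, not part of any library.

```csharp
using System;

// Hypothetical value type for illustration only.
public struct Coord
{
    public int X;
    public int Y;
}

public static class DefaultValues
{
    // True when 'c' still holds the all-zero default for its type.
    public static bool IsDefault(Coord c)
    {
        return c.Equals(default(Coord));
    }

    // Coord? is shorthand for Nullable<Coord>, which can represent "no value yet".
    public static bool HasBeenSet(Coord? c)
    {
        return c.HasValue;
    }
}
```

Note that comparing against default(T) cannot distinguish "never assigned" from "deliberately set to (0,0)"; when that distinction matters, the nullable form is the safer choice.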
Common Mistake #3: Using improper or unspecified string comparison methods
There are many different ways to compare strings in C#.
Although many programmers use the == operator for string comparison, it is actually one of the least desirable methods to employ, primarily because it doesn’t specify explicitly in the code which type of comparison is wanted.
Rather, the preferred way to test for string equality in C# is with the Equals method:
public bool Equals(string value);
public bool Equals(string value, StringComparison comparisonType);
The first method signature (i.e., without the comparisonType parameter) is actually the same as using the == operator, but has the benefit of being explicitly applied to strings. It performs an ordinal comparison of the strings, which is basically a byte-by-byte comparison. In many cases this is exactly the type of comparison you want, especially when comparing strings whose values are set programmatically, such as file names, environment variables, attributes, etc. In these cases, as long as an ordinal comparison is indeed the correct type of comparison for that situation, the only downside to using the Equals method without a comparisonType is that somebody reading the code may not know what type of comparison you’re making.
Using the Equals method signature that includes a comparisonType every time you compare strings, though, will not only make your code clearer, it will make you explicitly think about which type of comparison you need to make. This is a worthwhile thing to do, because even if English may not provide a whole lot of differences between ordinal and culture-sensitive comparisons, other languages provide plenty, and ignoring the possibility of other languages is opening yourself up to a lot of potential for errors down the road. For example:
string s = "strasse";
// outputs False:
Console.WriteLine(s == "straße");
Console.WriteLine(s.Equals("straße"));
Console.WriteLine(s.Equals("straße", StringComparison.Ordinal));
Console.WriteLine(s.Equals("Straße", StringComparison.CurrentCulture));
Console.WriteLine(s.Equals("straße", StringComparison.OrdinalIgnoreCase));
// outputs True:
Console.WriteLine(s.Equals("straße", StringComparison.CurrentCulture));
Console.WriteLine(s.Equals("Straße", StringComparison.CurrentCultureIgnoreCase));
The safest practice is to always provide a comparisonType parameter to the Equals method. Here are some basic guidelines:
- When comparing strings that were input by the user, or are to be displayed to the user, use a culture-sensitive comparison (CurrentCulture or CurrentCultureIgnoreCase).
- When comparing programmatic strings, use ordinal comparison (Ordinal or OrdinalIgnoreCase).
- InvariantCulture and InvariantCultureIgnoreCase are generally not to be used except in very limited circumstances, because ordinal comparisons are more efficient. If a culture-aware comparison is necessary, it should usually be performed against the current culture or another specific culture.
In addition to the Equals method, strings also provide the Compare method, which gives you information about the relative order of strings instead of just a test for equality. Because the string type does not define the <, <=, > and >= operators, Compare (or CompareOrdinal) is the way to order strings, and for the same reasons as discussed above you should prefer the overloads that take a StringComparison argument.
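As a rough illustration of ordering with an explicit comparison type, the sketch below wraps string.Compare in two hypothetical helpers. Only the sign of the result is meaningful: negative means the first string sorts first, zero means equal, positive means it sorts after.

```csharp
using System;

public static class OrderingExample
{
    // Case-sensitive ordering by UTF-16 code unit value.
    public static int CompareIdentifiers(string a, string b)
    {
        return string.Compare(a, b, StringComparison.Ordinal);
    }

    // Same ordering, but letter case is ignored.
    public static int CompareIdentifiersIgnoreCase(string a, string b)
    {
        return string.Compare(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

With these helpers, "a" sorts after "B" under Ordinal (lowercase letters have higher code values than uppercase ones) but before "B" under OrdinalIgnoreCase, which is exactly the kind of difference an unspecified comparison hides.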
Common Mistake #4: Using iterative (instead of declarative) statements to manipulate collections
In C# 3.0, the addition of Language-Integrated Query (LINQ) to the language changed forever the way collections are queried and manipulated. Since then, if you’re still writing iterative statements to manipulate collections, you are probably not using LINQ where you should be.
Some C# programmers don’t even know of LINQ’s existence, but fortunately that number is becoming increasingly small. Many still think, though, that because of the similarity between LINQ keywords and SQL statements, its only use is in code that queries databases.
While database querying is a very prevalent use of LINQ statements, they actually work over any enumerable collection (i.e., any object that implements the IEnumerable interface). So for example, if you had an array of Accounts, instead of writing:
decimal total = 0;
foreach (Account account in myAccounts) {
    if (account.Status == "active") {
        total += account.Balance;
    }
}
you could just write:
decimal total = (from account in myAccounts
                 where account.Status == "active"
                 select account.Balance).Sum();
While this is a pretty simple example of how to avoid this common C# programming problem, there are cases where a single LINQ statement can easily replace dozens of statements in an iterative loop (or nested loops) in your code. And less code generally means fewer opportunities for bugs to be introduced. Keep in mind, however, there may be a trade-off in terms of performance. In performance-critical scenarios, especially where your iterative code is able to make assumptions about your collection that LINQ cannot, be sure to do a performance comparison between the two methods.
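The same query can also be written with LINQ's method syntax, which some teams find easier to chain. The Account class below is a hypothetical stand-in (the article does not show its definition), so this is a sketch under that assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Account type for illustration only.
public class Account
{
    public string Status { get; set; }
    public decimal Balance { get; set; }
}

public static class AccountQueries
{
    // Method-syntax equivalent of the query expression shown above:
    // filter to active accounts, then sum their balances.
    public static decimal ActiveTotal(IEnumerable<Account> accounts)
    {
        return accounts
            .Where(a => a.Status == "active")
            .Sum(a => a.Balance);
    }
}
```

Both forms compile to the same calls on System.Linq.Enumerable, so choosing between them is a matter of readability rather than behavior.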
Common Mistake #5: Failing to consider the underlying objects in a LINQ statement
LINQ is great for abstracting the task of manipulating collections, whether they are in-memory objects, database tables, or XML documents. In a perfect world, you wouldn’t need to know what the underlying objects are. But the error here is assuming we live in a perfect world. In fact, identical LINQ statements can return different results when executed on the exact same data, if that data happens to be in a different format.
For instance, consider the following statement:
decimal total = (from account in myAccounts
                 where account.Status == "active"
                 select account.Balance).Sum();
What happens if one of the accounts has a Status equal to “Active” (note the capital A)? Well, if myAccounts was a DbSet object (that was set up with the default case-insensitive configuration), the where expression would still match that element. However, if myAccounts was an in-memory array, it would not match, and would therefore yield a different result for total.
But wait a minute. When we talked about string comparison earlier, we saw that the == operator performed an ordinal comparison of strings. So why in this case is the == operator performing a case-insensitive comparison?
The answer is that when the underlying objects in a LINQ statement are references to SQL table data (as is the case with the Entity Framework DbSet object in this example), the statement is converted into a T-SQL statement. Operators then follow T-SQL rules, not C# rules, so the comparison in the above case ends up being case insensitive.
In general, even though LINQ is a helpful and consistent way to query collections of objects, in reality you still need to know whether or not your statement will be translated to something other than C# under the hood to ensure that the behavior of your code will be as expected at runtime.
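For in-memory collections, one way to make the intended behavior explicit is to state the comparison type in the predicate. The sketch below uses a hypothetical Account class and applies only to LINQ to Objects; whether a database LINQ provider can translate this overload depends on the provider and is not assumed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Account type for illustration only.
public class Account
{
    public string Status { get; set; }
    public decimal Balance { get; set; }
}

public static class CaseInsensitiveQuery
{
    // In-memory query that matches "active", "Active", "ACTIVE", and so on.
    public static decimal ActiveTotal(IEnumerable<Account> accounts)
    {
        return accounts
            .Where(a => string.Equals(a.Status, "active", StringComparison.OrdinalIgnoreCase))
            .Sum(a => a.Balance);
    }
}
```

Written this way, the in-memory result no longer depends on how the status strings happen to be capitalized.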
Common Mistake #6: Getting confused or faked out by extension methods
As mentioned earlier, LINQ statements work on any object that implements IEnumerable. For example, the following simple function will add up the balances on any collection of accounts:
public decimal SumAccounts(IEnumerable<Account> myAccounts) {
    return myAccounts.Sum(a => a.Balance);
}
In the above code, the type of the myAccounts parameter is declared as IEnumerable<Account>. Since myAccounts references a Sum method (C# uses the familiar “dot notation” to reference a method on a class or interface), we’d expect to see a method called Sum() on the definition of the IEnumerable<T> interface. However, the definition of IEnumerable<T> makes no reference to any Sum method and simply looks like this:
public interface IEnumerable<out T> : IEnumerable {
    IEnumerator<T> GetEnumerator();
}
So where is the Sum() method defined? C# is strongly typed, so if the reference to the Sum method was invalid, the C# compiler would certainly flag it as an error. We therefore know that it must exist, but where? Moreover, where are the definitions of all the other methods that LINQ provides for querying or aggregating these collections?
The answer is that Sum() is not a method defined on the IEnumerable interface. Rather, it is a static method (called an “extension method”) that is defined on the System.Linq.Enumerable class:
namespace System.Linq {
    public static class Enumerable {
        ...
        // the reference here to “this IEnumerable<TSource> source” is
        // the magic sauce that provides access to the extension method Sum
        public static decimal Sum<TSource>(this IEnumerable<TSource> source,
                                           Func<TSource, decimal> selector);
        ...
    }
}
So what makes an extension method different from any other static method and what enables us to access it in other classes?
The distinguishing characteristic of an extension method is the this modifier on its first parameter. This is the “magic” that identifies it to the compiler as an extension method. The type of the parameter it modifies (in this case IEnumerable<TSource>) denotes the class or interface which will then appear to implement this method.
(As a side point, there’s nothing magical about the similarity between the name of the IEnumerable interface and the name of the Enumerable class on which the extension method is defined. This similarity is just an arbitrary stylistic choice.)
With this understanding, we can also see that the SumAccounts function we introduced above could instead have been implemented as follows:
public decimal SumAccounts(IEnumerable<Account> myAccounts) {
    return Enumerable.Sum(myAccounts, a => a.Balance);
}
The fact that we could have implemented it this way instead raises the question of why have extension methods at all? Extension methods are essentially a convenience of the C# language that enables you to “add” methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type.
Extension methods are brought into scope by including a using [namespace]; directive at the top of the file. You need to know which namespace includes the extension methods you’re looking for, but that’s pretty easy to determine once you know what it is you’re searching for.
When the C# compiler encounters a method call on an instance of an object, and doesn’t find that method defined on the referenced object class, it then looks at all extension methods that are within scope to try to find one which matches the required method signature and class. If it finds one, it will pass the instance reference as the first argument to that extension method, then the rest of the arguments, if any, will be passed as subsequent arguments to the extension method. (If the C# compiler doesn’t find any corresponding extension method within scope, it will throw an error.)
Extension methods are an example of “syntactic sugar” on the part of the C# compiler, which allows us to write code that is (usually) clearer and more maintainable. Clearer, that is, if you’re aware of their usage. Otherwise, it can be a bit confusing, especially at first.
While there certainly are advantages to using extension methods, they can cause problems and a cry for C# help for those developers who aren’t aware of them or don’t properly understand them. This is especially true when looking at code samples online, or at any other pre-written code. When such code produces compiler errors (because it invokes methods that clearly aren’t defined on the classes they’re invoked on), the tendency is to think the code applies to a different version of the library, or to a different library altogether. A lot of time can be spent searching for a new version, or phantom “missing library”, that doesn’t exist.
Even developers who are familiar with extension methods still get caught occasionally, when there is a method with the same name on the object, but its method signature differs in a subtle way from that of the extension method. A lot of time can be wasted looking for a typo or error that just isn’t there.
Use of extension methods in C# libraries is becoming increasingly prevalent. In addition to LINQ, the Unity Application Block and the Web API framework are examples of two heavily-used modern libraries by Microsoft which make use of extension methods as well, and there are many others. The more modern the framework, the more likely it is that it will incorporate extension methods.
Of course, you can write your own extension methods as well. Realize, however, that while extension methods appear to get invoked just like regular instance methods, this is really just an illusion. In particular, your extension methods can’t reference private or protected members of the class they’re extending and therefore cannot serve as a complete replacement for more traditional class inheritance.
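A user-defined extension method might look like the following. StringExtensions and Truncate are hypothetical names chosen for this illustration; the sketch only demonstrates the mechanics of the this modifier.

```csharp
using System;

public static class StringExtensions
{
    // The 'this' modifier on the first parameter lets callers write
    // text.Truncate(5) as if Truncate were an instance method of string.
    public static string Truncate(this string text, int maxLength)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));

        // Only the public members of string (Length, Substring) are accessible here.
        return text.Length <= maxLength ? text : text.Substring(0, maxLength);
    }
}
```

Calling "extension".Truncate(3) and StringExtensions.Truncate("extension", 3) are the same call; the first form is just the compiler's shorthand for the second.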
Common Mistake #7: Using the wrong type of collection for the task at hand
C# provides a large variety of collection objects, with the following being only a partial list:
Array, ArrayList, BitArray, BitVector32, Dictionary<K,V>, Hashtable, HybridDictionary, List<T>, NameValueCollection, OrderedDictionary, Queue, Queue<T>, SortedList, Stack, Stack<T>, StringCollection, StringDictionary.
While there can be cases where too many choices is as bad as not enough choices, that isn’t the case with collection objects. The number of options available can definitely work to your advantage. Take a little extra time upfront to research and choose the optimal collection type for your purpose. It will likely result in better performance and less room for error.
If there’s a collection type specifically targeted at the type of element you have (such as string or bit), lean toward using that one first. The implementation is generally more efficient when it’s targeted to a specific type of element.
To take advantage of the type safety of C#, you should usually prefer a generic interface over a non-generic one. The elements of a generic interface are of the type you specify when you declare your object, whereas the elements of non-generic interfaces are of type object. When using a non-generic interface, the C# compiler can’t type-check your code. Also, when dealing with collections of primitive value types, using a non-generic collection will result in repeated boxing/unboxing of those types, which can result in a significant negative performance impact when compared to a generic collection of the appropriate type.
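The difference between a non-generic and a generic collection can be sketched as follows. The helper names are hypothetical; ArrayList and List<T> are the standard .NET types.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionChoice
{
    // Non-generic: every element is stored as object, so each int is boxed
    // on insertion and must be cast (unboxed) when read back.
    public static int SumNonGeneric(ArrayList values)
    {
        int total = 0;
        foreach (object boxed in values)
        {
            total += (int)boxed; // throws InvalidCastException if an element is not an int
        }
        return total;
    }

    // Generic: the element type is checked at compile time and no boxing occurs.
    public static int SumGeneric(List<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }
}
```

With the ArrayList version, adding a string by mistake compiles without complaint and fails only at run time; the List<int> version rejects the same mistake at compile time.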
Another common C# problem is to write your own collection object. That isn’t to say it’s never appropriate, but with as comprehensive a selection as the one .NET offers, you can probably save a lot of time by using or extending one that already exists, rather than reinventing the wheel. In particular, the C5 Generic Collection Library for C# and CLI offers a wide array of additional collections “out of the box”, such as persistent tree data structures, heap based priority queues, hash indexed array lists, linked lists, and much more.
Common Mistake #8: Neglecting to free resources
The CLR environment employs a garbage collector, so you don’t need to explicitly free the memory created for any object. In fact, you can’t. There’s no equivalent of the C++ delete operator or the free() function in C. But that doesn’t mean that you can just forget about all objects after you’re done using them. Many types of objects encapsulate some other type of system resource (e.g., a disk file, database connection, network socket, etc.). Leaving these resources open can quickly deplete the total number of system resources, degrading performance and ultimately leading to program faults.
While a destructor method can be defined on any C# class, the problem with destructors (also called finalizers in C#) is that you can’t know for sure when they will be called. They are called by the garbage collector (on a separate thread, which can cause additional complications) at an indeterminate time in the future. Trying to get around these limitations by forcing garbage collection with GC.Collect() is not a good practice, as that will block the thread for an unknown amount of time while it collects all objects eligible for collection.
This is not to say there are no good uses for finalizers, but freeing resources in a deterministic way isn’t one of them. Rather, when you’re operating on a file, network or database connection, you want to explicitly free the underlying resource as soon as you are done with it.
Resource leaks are a concern in almost any environment. However, C# provides a mechanism that is robust and simple to use which, if utilized, can make leaks a much rarer occurrence. The .NET framework defines the IDisposable interface, which consists solely of the Dispose() method. Any object which implements IDisposable expects to have that method called whenever the consumer of the object is finished manipulating it. This results in explicit, deterministic freeing of resources.
If you are creating and disposing of an object within the context of a single code block, it is basically inexcusable to forget to call Dispose(), because C# provides a using statement that will ensure Dispose() gets called no matter how the code block is exited (whether it be an exception, a return statement, or simply the closing of the block). And yes, that’s the same using keyword mentioned previously that is used to include namespaces at the top of your file. It has a second, completely unrelated purpose, which many C# developers are unaware of; namely, to ensure that Dispose() gets called on an object when the code block is exited:
byte[] buffer = new byte[100];
using (FileStream myFile = File.OpenRead("foo.txt")) {
    myFile.Read(buffer, 0, 100);
}
By creating a using block in the above example, you know for sure that myFile.Dispose() will be called as soon as you’re done with the file, whether or not Read() throws an exception.
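To see the guarantee in action on a type of your own, consider the sketch below. TrackedResource is a hypothetical class that merely records whether Dispose() was called; a real implementation would release a file handle, connection, or similar resource instead.

```csharp
using System;

// Hypothetical resource wrapper for illustration only.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Use(bool fail)
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(TrackedResource));
        if (fail) throw new InvalidOperationException("simulated failure");
    }

    // A real implementation would release the underlying resource here.
    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Reports whether the resource was disposed after the using block,
    // regardless of whether Use() threw inside it.
    public static bool DisposedAfterUse(bool fail)
    {
        TrackedResource captured = null;
        try
        {
            using (var resource = new TrackedResource())
            {
                captured = resource;
                resource.Use(fail);
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the demo can inspect the resource afterwards.
        }
        return captured.IsDisposed;
    }
}
```

DisposedAfterUse returns true for both the normal path and the failing path, because the compiler expands the using block into a try/finally that calls Dispose().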
Common Mistake #9: Shying away from exceptions
C# continues its enforcement of type safety into runtime. This allows you to pinpoint errors much more quickly than in languages such as C++, where faulty type conversions can result in arbitrary values being assigned to an object’s fields. However, once again, programmers can squander this great feature, leading to C# problems. They fall into this trap because C# provides two different ways of doing things, one which can throw an exception, and one which won’t. Some will shy away from the exception route, figuring that not having to write a try/catch block saves them some coding.
For example, here are two different ways to perform an explicit type cast in C#:
// METHOD 1:
// Throws an exception if account can't be cast to SavingsAccount
SavingsAccount savingsAccount = (SavingsAccount)account;
// METHOD 2:
// Does NOT throw an exception if account can't be cast to
// SavingsAccount; will just set savingsAccount to null instead
SavingsAccount savingsAccount = account as SavingsAccount;
The most obvious error that could occur with the use of Method 2 would be a failure to check the return value. That would likely result in an eventual NullReferenceException, which could possibly surface at a much later time, making it much harder to track down the source of the problem. In contrast, Method 1 would have immediately thrown an InvalidCastException, making the source of the problem much more immediately obvious.
Moreover, even if you remember to check the return value in Method 2, what are you going to do if you find it to be null? Is the method you’re writing an appropriate place to report an error? Is there something else you can try if that cast fails? If not, then throwing an exception is the correct thing to do, so you might as well let it happen as close to the source of the problem as possible.
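One way to frame the choice is to ask whether "not a SavingsAccount" is an expected case with a sensible fallback. The sketch below uses hypothetical Account types and helper names to contrast the two situations.

```csharp
using System;

// Hypothetical account hierarchy for illustration only.
public class Account { }
public class SavingsAccount : Account { public decimal Rate; }
public class CheckingAccount : Account { }

public static class CastDemo
{
    // The caller requires a SavingsAccount, so a direct cast fails
    // immediately, at the point where the assumption is wrong.
    public static SavingsAccount RequireSavings(Account account)
    {
        return (SavingsAccount)account;
    }

    // Here other account types are legitimate and have a natural fallback,
    // so 'as' followed by a null check is a reasonable fit.
    public static decimal RateOrZero(Account account)
    {
        SavingsAccount savings = account as SavingsAccount;
        return savings != null ? savings.Rate : 0m;
    }
}
```

In the first helper a wrong type is a bug and should surface loudly; in the second it is ordinary data and is handled on the spot, so the null never escapes the method.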
Here are a couple of examples of other common pairs of methods where one throws an exception and the other does not:
int.Parse(); // throws exception if argument can’t be parsed
int.TryParse(); // returns a bool to denote whether parse succeeded
IEnumerable.First(); // throws exception if sequence is empty
IEnumerable.FirstOrDefault(); // returns null/default value if sequence is empty
Some programmers are so “exception averse” that they automatically assume the method that doesn’t throw an exception is superior. While there are certain select cases where this may be true, it is not at all correct as a generalization.
As a specific example, in a case where you have an alternative legitimate (e.g., default) action to take if an exception would have been generated, the non-exception approach could be a legitimate choice. In such a case, it may indeed be better to write something like this:
if (int.TryParse(myString, out myInt)) {
    // use myInt
} else {
    // use default value
}
instead of:
try {
    myInt = int.Parse(myString);
    // use myInt
} catch (FormatException) {
    // use default value
}
However, it is incorrect to assume that TryParse is therefore necessarily the “better” method. Sometimes that’s the case, sometimes it’s not. That’s why there are two ways of doing it. Use the correct one for the context you are in, remembering that exceptions can certainly be your friend as a developer.
Common Mistake #10: Allowing compiler warnings to accumulate
While this problem is definitely not C# specific, it is particularly egregious in C# since it abandons the benefits of the strict type checking offered by the C# compiler.
Warnings are generated for a reason. While all C# compiler errors signify a defect in your code, many warnings do as well. What differentiates the two is that, in the case of a warning, the compiler has no problem emitting the instructions your code represents. Even so, it finds your code a little bit fishy, and there is a reasonable likelihood that your code doesn’t accurately reflect your intent.
A common simple example for the sake of this C# tutorial is when you modify your algorithm to eliminate the use of a variable you were using, but you forget to remove the variable declaration. The program will run perfectly, but the compiler will flag the useless variable declaration. The fact that the program runs perfectly causes programmers to neglect to fix the cause of the warning. Furthermore, programmers take advantage of a Visual Studio feature which makes it easy for them to hide the warnings in the “Error List” window so they can focus only on the errors. It doesn’t take long until there are dozens of warnings, all of them blissfully ignored (or even worse, hidden).
But if you ignore this type of warning, sooner or later, something like this may very well find its way into your code:
class Account {
    int myId;
    int Id; // compiler warned you about this, but you didn’t listen!
    // Constructor
    Account(int id) {
        this.myId = Id; // OOPS!
    }
}
And at the speed IntelliSense allows us to write code, this error isn’t as improbable as it looks.
You now have a serious error in your program (although the compiler has only flagged it as a warning, for the reasons already explained), and depending on how complex your program is, you could waste a lot of time tracking this one down. Had you paid attention to this warning in the first place, you would have avoided this problem with a simple five-second fix.
Remember, the C# compiler gives you a lot of useful information about the robustness of your code… if you’re listening. Don’t ignore warnings. They usually only take a few seconds to fix, and fixing new ones when they happen can save you hours. Train yourself to expect the Visual Studio “Error List” window to display “0 Errors, 0 Warnings”, so that any warnings at all make you uncomfortable enough to address them immediately.
Of course, there are exceptions to every rule. Accordingly, there may be times when your code will look a bit fishy to the compiler, even though it is exactly how you intended it to be. In those very rare cases, use #pragma warning disable [warning id] around only the code that triggers the warning, and only for the warning ID that it triggers. This will suppress that warning, and that warning only, so that you can still stay alert for new ones.
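A narrowly scoped suppression might look like the sketch below. LegacyCaller and its members are hypothetical; CS0618 is the compiler's warning for using a member marked [Obsolete] with a message.

```csharp
using System;

public static class LegacyCaller
{
    [Obsolete("Use a newer API instead.")]
    public static int OldApi()
    {
        return 1;
    }

    public static int CallOldApiDeliberately()
    {
        // Suppress only CS0618, and only around the one line that needs it.
#pragma warning disable CS0618
        int result = OldApi();
#pragma warning restore CS0618
        return result;
    }
}
```

Pairing every disable with a restore keeps the suppressed region as small as possible, so the same warning still appears anywhere else in the file.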
Wrap-up
C# is a powerful and flexible language with many mechanisms and paradigms that can greatly improve productivity. As with any software tool or language, though, having a limited understanding or appreciation of its capabilities can sometimes be more of an impediment than a benefit, leaving one in the proverbial state of “knowing enough to be dangerous”.
Using a C# tutorial like this one to familiarize oneself with the key nuances of C#, such as (but by no means limited to) the problems raised in this article, will help optimize use of the language while avoiding some of its more common pitfalls.
Chinese translation:
关于本文
本文描述了10个 C# 程序员常犯的错误,或应该避免的陷阱。
尽管本文讨论的大多数错误是针对 C# 的,有些错误与其他以 CLR 为目标的语言,或者用到了 Framework Class Library (FCL) 的语言也相关。
常见错误 #1: 把引用当做值来用,或者反过来
C++ 和其他很多语言的程序员,习惯了给变量赋值的时候,要么赋单纯的值,要么是现有对象的引用。然而,在C# 中,是值还是引用,是由写这个对象的程序员决定的,而不是实例化对象并赋值的程序员决定的。这往往会坑到 C# 的新手程序员。
如果你不知道你正在使用的对象是否是值类型或引用类型,你可能会遇到一些惊喜。例如:
Point point1 = new Point(20, 30); Point point2 = point1; point2.X = 50; Console.WriteLine(point1.X); // 20 (does this surprise you?) Console.WriteLine(point2.X); // 50 Pen pen1 = new Pen(Color.Black); Pen pen2 = pen1; pen2.Color = Color.Blue; Console.WriteLine(pen1.Color); // Blue (or does this surprise you?) Console.WriteLine(pen2.Color); // Blue
如你所见,尽管Point和Pen对象的创建方式相同,但是当一个新的X的坐标值被分配到point2时, point1的值保持不变 。而当一个新的color值被分配到pen2,pen1也随之改变。因此,我们可以推断point1和point2每个都包含自己的Point对象的副本,而pen1和pen2引用了同一个Pen对象 。如果没有这个测试,我们怎么能够知道这个原理?
一种办法是去看一下对象是如何定义的(在Visual Studio中,你可以把光标放在对象的名字上,并按下F12键)
public struct Point { … } // defines a “value” type public class Pen { … } // defines a “reference” type
如上所示,在C#中,struct关键字是用来定义一个值类型,而class关键字是用来定义引用类型的。 对于那些有C++编程背景人来说,如果被C++和C#之间某些类似的关键字搞混,可能会对以上这种行为感到很吃惊。
如果你想要依赖的行为会因值类型和引用类型而异,举例来说,如果你想把一个对象作为参数传给一个方法,并在这个方法中修改这个对象的状态。你一定要确保你在处理正确的类型对象。
常见的错误#2:误会未初始化变量的默认值
在C#中,值得类型不能为空。根据定义,值的类型值,甚至初始化变量的值类型必须有一个值。这就是所谓的该类型的默认值。这通常会导致以下,意想不到的结果时,检查一个变量是否未初始化:
class Program { static Point point1; static Pen pen1; static void Main(string[] args) { Console.WriteLine(pen1 == null); // True Console.WriteLine(point1 == null); // False (huh?) } }
为什么不是【point 1】空?答案是,点是一个值类型,和默认值点(0,0)一样,没有空值。未能认识到这是一个非常简单和常见的错误,在C#中
很多(但是不是全部)值类型有一个【IsEmpty】属性,你可以看看它等于默认值:
Console.WriteLine(point1.IsEmpty); // True
当你检查一个变量是否已经初始化,确保你知道值未初始化是变量的类型,将会在默认情况下,不为空值。
常见错误 #3: 使用不恰当或未指定的方法比较字符串
在C#中有很多方法来比较字符串。
虽然有不少程序员使用==操作符来比较字符串,但是这种方法实际上是最不推荐使用的。主要原因是由于这种方法没有在代码中显示的指定使用哪种类型去比较字符串。
相反,在C#中判断字符串是否相等最好使用Equals方法:
public bool Equals(string value); public bool Equals(string value, StringComparison comparisonType);
第一个Equals方法(没有comparisonType这参数)和使用==操作符的结果是一样的,但好处是,它显式的指明了比较类型。它会按顺序逐字节的去比较字符串。在很多情况下,这正是你所期望的比较类型,尤其是当比较一些通过编程设置的字符串,像文件名,环境变量,属性等。在这些情况下,只要按顺序逐字节的比较就可以了。使用不带comparisonType参数的Equals方法进行比较的唯一一点不好的地方在于那些读你程序代码的人可能不知道你的比较类型是什么。
使用带comparisonType的Equals方法去比较字符串,不仅会使你的代码更清晰,还会使你去考虑清楚要用哪种类型去比较字符串。这种方法非常值得你去使用,因为尽管在英语中,按顺序进行的比较和按语言区域进行的比较之间并没有太多的区别,但是在其他的一些语种可能会有很大的不同。如果你忽略了这种可能性,无疑是为你自己在未来的道路上挖了很多“坑”。举例来说:
string s = "strasse";

// outputs False:
Console.WriteLine(s == "straße");
Console.WriteLine(s.Equals("straße"));
Console.WriteLine(s.Equals("straße", StringComparison.Ordinal));
Console.WriteLine(s.Equals("Straße", StringComparison.CurrentCulture));
Console.WriteLine(s.Equals("straße", StringComparison.OrdinalIgnoreCase));

// outputs True (culture-sensitive; the result can vary with the current culture and runtime):
Console.WriteLine(s.Equals("straße", StringComparison.CurrentCulture));
Console.WriteLine(s.Equals("Straße", StringComparison.CurrentCultureIgnoreCase));
The safest practice is to always pass a comparisonType argument to the Equals method.
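Here are some basic guidelines:
When comparing strings that were entered by the user, or that will be displayed to the user, use a culture-sensitive comparison (CurrentCulture or CurrentCultureIgnoreCase).
When comparing programmatic strings, use an ordinal comparison (Ordinal or OrdinalIgnoreCase).
InvariantCulture and InvariantCultureIgnoreCase should generally not be used except in very limited circumstances, because ordinal comparisons are more efficient. If a culture-aware comparison is required, it should usually be performed against the current culture or another specific culture.
In addition to Equals, strings also provide the Compare method, which reports the relative order of two strings rather than just testing for equality. This method takes the place of the <, <=, > and >= operators, which C# does not define for strings, and the guidance above about specifying the comparison type applies to it as well.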
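The guidelines above for programmatic strings can be sketched as follows. The helper names (IsSameExtension, CompareOrdinal, CompareIgnoringCase) are made up for this illustration; string.Equals and string.Compare with a StringComparison argument are the standard library calls.

```csharp
using System;

public static class StringComparisonDemo
{
    // Programmatic strings (file extensions, keys): ordinal, ignoring case.
    public static bool IsSameExtension(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

    // Relative order with an explicit, case-sensitive ordinal comparison.
    public static int CompareOrdinal(string a, string b) =>
        string.Compare(a, b, StringComparison.Ordinal);

    // Relative order with an explicit, case-insensitive ordinal comparison.
    public static int CompareIgnoringCase(string a, string b) =>
        string.Compare(a, b, StringComparison.OrdinalIgnoreCase);

    public static void Main()
    {
        Console.WriteLine(IsSameExtension(".TXT", ".txt"));             // True

        // Ordinal: 'a' (97) sorts after 'B' (66), so the result is positive.
        Console.WriteLine(CompareOrdinal("apple", "Banana") > 0);       // True

        // Ignoring case, "apple" sorts before "banana": negative result.
        Console.WriteLine(CompareIgnoringCase("apple", "Banana") < 0);  // True
    }
}
```

Only the sign of the value returned by Compare is meaningful, so the results are tested against zero rather than against -1 or 1.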
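Common Mistake #4: Using iterative (instead of declarative) statements to manipulate collections
In C# 3.0, the introduction of Language-Integrated Query (LINQ) changed forever the way collections are queried and manipulated. Since then, if you are using iterative statements to manipulate collections, you are not using LINQ where you probably should be.
Some C# programmers don't even know that LINQ exists, but fortunately their number is steadily shrinking. Many still think, though, that because LINQ keywords look so much like SQL statements, LINQ is only useful in code that queries databases.
While database querying is a very common use of LINQ, it actually works over any enumerable collection (i.e., any object that implements the IEnumerable interface). So, for example, if you had an array of Accounts, instead of writing: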
decimal total = 0;
foreach (Account account in myAccounts) {
    if (account.Status == "active") {
        total += account.Balance;
    }
}
you could just write:
decimal total = (from account in myAccounts
                 where account.Status == "active"
                 select account.Balance).Sum();
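While this is a pretty simple example, there are cases where a single LINQ statement can easily replace dozens of statements in an iterative loop (or nested loops). Less code generally means fewer opportunities for bugs to be introduced. Keep in mind, however, that there may be a trade-off in performance. In performance-critical scenarios, especially where your iterative code can make assumptions about your collection that LINQ cannot, be sure to compare the performance of the two approaches.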
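As a slightly larger illustration of the point about replacing loops, the sketch below totals active balances per owner in one declarative statement. The Account class here is a hypothetical shape invented for the example (the article never shows its definition), and the Owner property is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Account shape for illustration; Owner is an assumed property.
public sealed class Account
{
    public Account(string owner, string status, decimal balance)
    {
        Owner = owner;
        Status = status;
        Balance = balance;
    }

    public string Owner { get; }
    public string Status { get; }
    public decimal Balance { get; }
}

public static class LinqDemo
{
    // Filter, group, and aggregate without any explicit loop.
    public static Dictionary<string, decimal> ActiveTotalsByOwner(IEnumerable<Account> accounts) =>
        accounts.Where(a => a.Status == "active")
                .GroupBy(a => a.Owner)
                .ToDictionary(g => g.Key, g => g.Sum(a => a.Balance));

    public static void Main()
    {
        var accounts = new[]
        {
            new Account("ann", "active", 100m),
            new Account("ann", "active", 50m),
            new Account("bob", "closed", 10m),
            new Account("bob", "active", 5m),
        };

        foreach (var pair in ActiveTotalsByOwner(accounts))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

The equivalent iterative version would need a dictionary lookup, an existence check, and an accumulation step inside the loop, each of which is a place for a bug to hide.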
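Common Mistake #5: Failing to consider the underlying objects in a LINQ statement
LINQ is great at abstracting the task of manipulating collections, whether they are in-memory objects, database tables, or XML documents. In a perfect world, you wouldn't need to know what the underlying objects are. The mistake here is assuming that we live in a perfect world. In fact, identical LINQ statements can return different results when executed on exactly the same data, if that data happens to be in a different format.
For example, consider the following statement: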
decimal total = (from account in myAccounts
                 where account.Status == "active"
                 select account.Balance).Sum();
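What happens if one of the objects has an account.Status equal to "Active" (note the capital A)?
Well, if myAccounts is a DbSet object (set up with the default case-insensitive configuration), the where expression will still match that element. However, if myAccounts is an in-memory array, it will not match, and the statement will therefore yield a different total.
But wait a minute. When we discussed string comparison earlier, we saw that the == operator performs an ordinal comparison. So why does == perform a case-insensitive comparison in this case?
The answer is that when the underlying objects in a LINQ statement refer to SQL table data (as with the Entity Framework DbSet object in this example), the statement is converted into a T-SQL statement. Operators then follow T-SQL rules, not C# rules, so the comparison above ends up being case-insensitive.
In general, even though LINQ is a helpful and consistent way to query collections of objects, in practice you still need to know whether your statement will be translated into something other than C# under the hood, to be sure that your code behaves as expected at runtime.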
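The in-memory half of this difference is easy to reproduce. The sketch below shows only LINQ to Objects, where == is a case-sensitive ordinal comparison, next to an explicitly case-insensitive predicate. How a database-backed query behaves depends on the provider and the database collation, so that side is deliberately not shown; the method names are invented for this example.

```csharp
using System;
using System.Linq;

public static class UnderlyingObjectDemo
{
    // LINQ to Objects: == on strings is an ordinal, case-sensitive comparison.
    public static int CountWithOperator(string[] statuses) =>
        statuses.Count(s => s == "active");

    // Stating the comparison type makes the in-memory behavior explicit.
    public static int CountIgnoringCase(string[] statuses) =>
        statuses.Count(s => string.Equals(s, "active", StringComparison.OrdinalIgnoreCase));

    public static void Main()
    {
        string[] statuses = { "active", "Active", "closed" };

        Console.WriteLine(CountWithOperator(statuses));  // 1
        Console.WriteLine(CountIgnoringCase(statuses));  // 2
    }
}
```

Whether an explicit StringComparison argument can be translated by a given database provider is something to check in that provider's documentation rather than assume.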
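Common Mistake #6: Getting confused or faked out by extension methods
As mentioned earlier, LINQ statements work on any object that implements IEnumerable. For example, the following simple function adds up the balances of any collection of accounts: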
public decimal SumAccounts(IEnumerable<Account> myAccounts) {
    return myAccounts.Sum(a => a.Balance);
}
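In the code above, the type of the myAccounts parameter is declared as IEnumerable<Account>. Since myAccounts references a Sum method (C# uses the familiar "dot notation" to reference a method on a class or interface), we would expect to find a method called Sum() in the definition of the IEnumerable<T> interface. However, the definition of IEnumerable<T> makes no reference to any Sum method and simply looks like this: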
public interface IEnumerable<out T> : IEnumerable {
    IEnumerator<T> GetEnumerator();
}
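So where is the Sum() method defined? C# is strongly typed, so if the reference to Sum were invalid, the C# compiler would certainly flag it as an error. We therefore know that it must exist, but where? Moreover, where are all the other methods that LINQ provides for querying and aggregating collections defined?
The answer is that Sum() is not defined on the IEnumerable interface. Rather, it is a static method (called an "extension method") defined on the System.Linq.Enumerable class: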
namespace System.Linq {
    public static class Enumerable {
        ...
        // the reference here to "this IEnumerable<TSource> source" is
        // the magic sauce that provides access to the extension method Sum
        public static decimal Sum<TSource>(this IEnumerable<TSource> source,
                                           Func<TSource, decimal> selector);
        ...
    }
}
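So what makes an extension method different from any other static method, and what allows us to access it from other classes?
The distinguishing characteristic of an extension method is the this modifier on its first parameter. This is the "magic" that identifies it to the compiler as an extension method. The type of the parameter it modifies (in this case IEnumerable<TSource>) denotes the class or interface that will then appear to implement the method.
(As a side point, there is nothing magical about the similarity between the name of the IEnumerable interface and the name of the Enumerable class on which the extension method is defined. The similarity is just an arbitrary stylistic choice.)
With this understanding, we can see that the SumAccounts function introduced above could instead have been implemented as follows: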
public decimal SumAccounts(IEnumerable<Account> myAccounts) {
    return Enumerable.Sum(myAccounts, a => a.Balance);
}
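The fact that we could have implemented it this way raises the question of why extension methods exist at all. Extension methods are essentially a convenience of the C# language that lets you "add" methods to existing types without creating a derived type, recompiling, or otherwise modifying the original type.
Extension methods are brought into scope by a using [namespace]; statement at the top of the file. You need to know which namespace contains the extension methods you are looking for, but that is easy to determine once you know what you are searching for.
When the C# compiler encounters a method call on an instance of an object and doesn't find that method defined on the object's class, it looks through all extension methods in scope to find one that matches the required class and method signature. If it finds one, it passes the instance reference as the first argument to that extension method, followed by the remaining arguments, if any. (If the compiler finds no matching extension method in scope, it reports an error.)
Extension methods are an example of "syntactic sugar" on the part of the C# compiler, which lets us write code that is (usually) clearer and more maintainable. Clearer, that is, if you are aware of how they work. Otherwise they can be confusing, especially at first.
While using extension methods certainly has advantages, they can cause headaches and wasted time for developers who are unaware of them or don't properly understand them. This is especially true when reading code samples online or any other pre-written code. When such code produces compiler errors (because it invokes methods that clearly aren't defined on the classes they are invoked on), the tendency is to assume that the code applies to a different version of the library, or to a different library altogether. A lot of time can be spent searching for a new version, or a phantom "missing library", that doesn't exist.
Even developers who are familiar with extension methods still get caught occasionally, when a method of the same name exists on the object but its signature differs subtly from that of the extension method. A lot of time can be wasted looking for a typo or error that just isn't there.
The use of extension methods in C# libraries is becoming increasingly prevalent. In addition to LINQ, the Unity Application Block and the Web API framework are two heavily used Microsoft libraries that rely on extension methods, and there are many others. The more modern the framework, the more likely it is to incorporate extension methods.
Of course, you can write your own extension methods as well. Realize, however, that although extension methods appear to be invoked just like regular instance methods, this is really just an illusion. In particular, extension methods cannot access private or protected members of the class they extend, and therefore cannot serve as a complete replacement for traditional class inheritance.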
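A user-defined extension method can be sketched in a few lines. StringExtensions and WordCount are names invented for this illustration; the example also shows that the instance-style call and the plain static call are the same thing to the compiler.

```csharp
using System;

// Hypothetical extension class; the this modifier on the first parameter
// is what makes WordCount callable as if it were a method of string.
public static class StringExtensions
{
    public static int WordCount(this string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string sentence = "extension methods are static methods";

        // Instance-style call, resolved by the compiler to the static method.
        Console.WriteLine(sentence.WordCount());                  // 5

        // The equivalent explicit static call.
        Console.WriteLine(StringExtensions.WordCount(sentence));  // 5
    }
}
```

WordCount can only use the public surface of string here, which is the limitation described above: an extension method has no access to the private or protected members of the type it extends.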
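Common Mistake #7: Using the wrong type of collection for the task at hand
C# provides a large variety of collection objects, of which the following is only a partial list:
Array, ArrayList, BitArray, BitVector32, Dictionary<K,V>, Hashtable, HybridDictionary, List<T>, NameValueCollection, OrderedDictionary, Queue, Queue<T>, SortedList, Stack, Stack<T>, StringCollection, StringDictionary.
While there are cases where too many choices are as bad as too few, that isn't the case with collection objects. The number of options available can definitely work to your advantage. Take a little extra time up front to research and choose the optimal collection type for your purpose. It will likely result in better performance and less room for error.
If there is a collection type specifically targeted at the kind of element you have (such as string or bit), lean toward using that first. The implementation is generally more efficient when it is targeted at a specific element type.
To take advantage of the type safety of C#, you should usually prefer a generic interface over a non-generic one. The elements of a generic interface are of the type you specify when you declare your object, whereas the elements of a non-generic interface are of type object. When you use a non-generic interface, the C# compiler cannot type-check your code. Also, when you work with collections of primitive value types, a non-generic collection causes repeated boxing and unboxing of those values, which can have a significant negative performance impact compared to a generic collection of the appropriate type.
Another common pitfall is writing your own collection object. That isn't to say it is never appropriate, but with a selection as comprehensive as the one .NET offers, you can probably save a lot of time by using or extending an existing type rather than reinventing the wheel. In particular, the C5 Generic Collection Library for C# and CLI offers a wide array of additional collections "out of the box", such as persistent tree data structures, heap-based priority queues, hash-indexed array lists, linked lists, and much more.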
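The type-safety and boxing points above can be illustrated with ArrayList and List<int>. This is a minimal sketch, not a benchmark; the method names are invented for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionChoiceDemo
{
    // Non-generic: every element is stored as object, so ints are boxed on
    // the way in and must be unboxed (with a cast) on the way out.
    public static int SumNonGeneric(ArrayList values)
    {
        int total = 0;
        foreach (object boxed in values)
        {
            total += (int)boxed;  // throws InvalidCastException if not an int
        }
        return total;
    }

    // Generic: the element type is checked at compile time and no boxing occurs.
    public static int SumGeneric(List<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumGeneric(new List<int> { 1, 2, 3 }));   // 6
        Console.WriteLine(SumNonGeneric(new ArrayList { 1, 2, 3 })); // 6

        // Compiles fine, but fails only at runtime:
        try
        {
            SumNonGeneric(new ArrayList { 1, "two" });
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("non-int element found at runtime");
        }
    }
}
```

With List<int>, the attempt to add "two" would be rejected by the compiler, so the error never reaches runtime.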
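Common Mistake #8: Neglecting to free resources
The CLR environment employs a garbage collector, so you don't need to explicitly free the memory allocated for any object. In fact, you can't. C# has no equivalent of the C++ delete operator or the C free() function. But that doesn't mean you can forget about all objects once you are done with them. Many types of objects encapsulate some other kind of system resource (e.g., a disk file, a database connection, a network socket). Leaving these resources open can quickly deplete the system's supply of them, degrading performance and ultimately causing program faults.
While a destructor can be defined on any C# class, the problem with destructors (also called finalizers in C#) is that you cannot know for sure when they will be called. They are invoked by the garbage collector (on a separate thread, which can cause additional complications) at some indeterminate time in the future. Trying to get around this limitation by forcing a collection with GC.Collect() is not good practice, because it blocks the thread for an unknown amount of time while all eligible objects are collected.
This is not to say that finalizers have no good uses, but freeing resources in a deterministic way isn't one of them. Rather, when you are working with a file, a network connection, or a database connection, you should explicitly free the underlying resource as soon as you are done with it.
Resource leaks are a concern in almost any environment. However, C# provides a mechanism that is robust and simple to use and that, if used properly, makes leaks much rarer. The .NET framework defines the IDisposable interface, which consists solely of the Dispose() method. Any object that implements IDisposable expects that method to be called whenever the consumer of the object has finished with it. The result is explicit, deterministic freeing of resources.
If you create and dispose of an object within a single code block, forgetting to call Dispose() is basically inexcusable, because C# provides a using statement that ensures Dispose() is called no matter how the block is exited (whether by an exception, a return statement, or simply reaching the end of the block). And yes, this is the same using keyword mentioned earlier for importing namespaces at the top of a file. It has a second, completely unrelated purpose that many C# developers are unaware of: ensuring that Dispose() is called on an object when the code block is exited: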
using (FileStream myFile = File.OpenRead("foo.txt")) {
    myFile.Read(buffer, 0, 100);
}
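By creating a using block as in the example above, you know for sure that myFile.Dispose() will be called as soon as you are done with the file, whether or not Read() throws an exception.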
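The guarantee described above can be observed directly with a small disposable type. TrackedResource is a hypothetical class written for this sketch; it holds no real system resource and only records whether Dispose() was called.

```csharp
using System;

// Hypothetical resource wrapper that records whether it was disposed.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Use(bool fail)
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(TrackedResource));
        if (fail) throw new InvalidOperationException("simulated failure");
    }

    public void Dispose()
    {
        IsDisposed = true;  // a real implementation would release its resource here
    }
}

public static class UsingDemo
{
    // Returns whether the resource was disposed after the using block,
    // both when Use succeeds and when it throws.
    public static bool DisposedAfterUse(bool fail)
    {
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                resource.Use(fail);
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the demo can inspect the resource afterwards.
        }
        return resource.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedAfterUse(fail: false));  // True
        Console.WriteLine(DisposedAfterUse(fail: true));   // True
    }
}
```

In production code the resource would normally be declared inside the using statement itself, as in the FileStream example, so that it cannot be touched after disposal.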
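Common Mistake #9: Shying away from exceptions
C# continues its enforcement of type safety into runtime. This lets you pinpoint many kinds of errors much more quickly than in languages such as C++, where a faulty type conversion can result in arbitrary values being assigned to an object's fields. Once again, however, programmers can squander this feature. They fall into the trap because C# provides two ways of doing things, one that can throw an exception and one that won't. Some shy away from the exception route, figuring that not having to write a try/catch block saves them some code.
For example, here are two different ways to perform an explicit type cast in C#: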
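// METHOD 1:
// Throws an exception if account can't be cast to SavingsAccount
SavingsAccount savingsAccount = (SavingsAccount)account;

// METHOD 2:
// Does NOT throw an exception if account can't be cast to
// SavingsAccount; sets savingsAccount to null instead
SavingsAccount savingsAccount = account as SavingsAccount;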
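The most obvious error that can occur with Method 2 is failing to check the return value. That would likely result in an eventual NullReferenceException, possibly surfacing much later, which makes the source of the problem much harder to track down. In contrast, Method 1 would immediately throw an InvalidCastException, making the source of the problem obvious.
Moreover, even if you remember to check the return value in Method 2, what will you do if you find it to be null? Is the method you are writing an appropriate place to report an error? Is there something else you can try if the cast fails? If not, then throwing an exception is the correct thing to do, and you may as well let it happen as close to the source of the problem as possible.
Here are a couple of other common pairs of methods, where one throws an exception and the other does not: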
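int.Parse();                    // throws an exception if the argument can't be parsed
int.TryParse();                 // returns a bool indicating whether the parse succeeded
IEnumerable.First();            // throws an exception if the sequence is empty
IEnumerable.FirstOrDefault();   // returns null or the default value if the sequence is empty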
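Some C# developers are so "exception averse" that they automatically assume the method that doesn't throw an exception is superior. While that may be true in certain select cases, it is not at all correct as a generalization.
As a specific example, when you have a legitimate alternative action to take (e.g., using a default value) in the case where an exception would have been thrown, the non-exception approach can be a reasonable choice. In such a case, it may indeed be better to write something like this: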
if (int.TryParse(myString, out myInt)) {
    // use myInt
} else {
    // use default value
}
instead of:
try {
    myInt = int.Parse(myString);
    // use myInt
} catch (FormatException) {
    // use default value
}
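However, it is incorrect to conclude that TryParse is therefore necessarily the "better" method. Sometimes it is, sometimes it isn't. That is why there are two ways of doing it. Use the one that fits the context you are in, and remember that exceptions can certainly be your friend as a developer.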
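Both choices can appear side by side in the same program. In the sketch below, a missing or malformed port number has a sensible fallback, so TryParse fits; a wrong account type has no sensible fallback, so the throwing cast fits. The class shapes, the Rate field, and the method names are assumptions made for this illustration.

```csharp
using System;

// Hypothetical account hierarchy for illustration.
public class Account { }
public class SavingsAccount : Account { public decimal Rate = 0.02m; }

public static class ExceptionChoiceDemo
{
    // A fallback exists, so the non-throwing method is a good fit.
    public static int ParsePortOrDefault(string text, int fallback) =>
        int.TryParse(text, out int port) ? port : fallback;

    // No fallback exists: let the cast throw InvalidCastException right here,
    // close to the source of the problem, instead of returning null.
    public static decimal RateOf(Account account)
    {
        var savings = (SavingsAccount)account;
        return savings.Rate;
    }

    public static void Main()
    {
        Console.WriteLine(ParsePortOrDefault("8080", 80));  // 8080
        Console.WriteLine(ParsePortOrDefault("abc", 80));   // 80
        Console.WriteLine(RateOf(new SavingsAccount()));    // 0.02

        try
        {
            RateOf(new Account());
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("not a savings account");
        }
    }
}
```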
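Common Mistake #10: Allowing compiler warnings to accumulate
While this problem is definitely not specific to C#, it is particularly egregious in C# because it throws away the benefits of the strict type checking that the C# compiler offers.
Warnings are generated for a reason. While all C# compiler errors signify a defect in your code, many warnings do as well. The difference is that, in the case of a warning, the compiler has no problem emitting the instructions your code represents. Even so, it finds your code a little bit fishy, and there is a reasonable likelihood that the code does not accurately reflect your intent.
A common, simple example is when you modify your algorithm so that a variable is no longer used, but forget to remove the variable's declaration. The program runs perfectly, but the compiler flags the unused declaration. The fact that the program runs perfectly leads programmers to neglect the cause of the warning. Worse, some take advantage of the Visual Studio feature that hides warnings in the "Error List" window so that they can focus only on errors. It doesn't take long before there are dozens of warnings, all of them blissfully ignored (or, even worse, hidden).
But if you ignore this kind of warning, sooner or later something like this may well find its way into your code: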
class Account {
    int myId;
    int Id;   // compiler warned you about this, but you didn't listen!

    // Constructor
    Account(int id) {
        this.myId = Id;     // OOPS!
    }
}
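And at the speed IntelliSense lets us write code, this error isn't as improbable as it looks.
You now have a serious error in your program (although the compiler flagged it only as a warning, for the reasons already explained), and depending on how complex your program is, you could waste a lot of time tracking it down. Had you paid attention to the warning in the first place, you would have avoided the problem with a simple five-second fix.
Remember, the C# compiler gives you a lot of useful information about the robustness of your code, if you are listening. Don't ignore warnings. They usually take only a few seconds to fix, and fixing new ones as they appear can save you hours. Train yourself to expect the Visual Studio "Error List" window to display "0 Errors, 0 Warnings", so that any warning at all makes you uncomfortable enough to address it immediately.
Of course, there are exceptions to every rule. There may be times when your code looks a bit fishy to the compiler even though it is exactly what you intended. In those very rare cases, use #pragma warning disable [warning id] around only the code that triggers the warning, and only for the warning ID it triggers. This suppresses that warning, and that warning only, so that you stay alert for new ones.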
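A narrowly scoped suppression might look like the sketch below. The example uses warning CS0618 (use of a member marked [Obsolete]) as the one being silenced; LegacyApi and its methods are hypothetical names invented for the illustration.

```csharp
using System;

// Hypothetical legacy API whose old entry point is marked obsolete.
public static class LegacyApi
{
    [Obsolete("Use NewCompute instead.")]
    public static int OldCompute(int x) => x * 2;

    public static int NewCompute(int x) => x * 2;
}

public static class PragmaDemo
{
    // Deliberately calls the obsolete method, e.g. to compare old and new results.
    public static int CallLegacyOnPurpose(int x)
    {
        // Suppress only CS0618, and only around the one line that needs it.
#pragma warning disable CS0618
        int result = LegacyApi.OldCompute(x);
#pragma warning restore CS0618
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(CallLegacyOnPurpose(21));  // 42
    }
}
```

Pairing the disable directive with a matching restore keeps the suppression from silently extending to the rest of the file.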
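Wrap-up
C# is a powerful and flexible language with many mechanisms and paradigms that can greatly improve productivity. As with any language or tool, though, a limited understanding of its capabilities can be more of an impediment than a benefit, leaving you in the proverbial state of "knowing enough to be dangerous".
Becoming familiar with the key nuances of C#, such as (but by no means limited to) the ones raised in this article, will help you make better use of the language while avoiding some of its more common pitfalls.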