Yesterday a colleague asked me how to convert a UTF-16 XML file to UTF-8. He was using XSLT for the conversion. I took a look at his code and everything looked fine: he was using a StringBuilder, a StringWriter and an XmlWriter, and he had set the encoding in the XmlWriter settings to UTF-8. Should work, shouldn't it? No, it does not. A .NET string is always UTF-16 internally and that cannot be changed, so a StringWriter reports UTF-16 as its encoding. When an XmlWriter writes to a TextWriter, it takes the encoding from that writer, so the setting my colleague made was never used.
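To see the effect in isolation, here is a minimal sketch that leaves XSLT out of it. The class and method names (EncodingDemo, WriteToString) are made up for illustration and are not from my colleague's code. It asks for UTF-8 in the settings, writes to a StringWriter, and returns the resulting text so the XML declaration can be inspected.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class, not part of the original project.
static class EncodingDemo
{
    public static string WriteToString()
    {
        // Ask for UTF-8, as my colleague did.
        var settings = new XmlWriterSettings { Encoding = Encoding.UTF8 };

        var sb = new StringBuilder();
        using (XmlWriter xw = XmlWriter.Create(new StringWriter(sb), settings))
        {
            xw.WriteStartElement("root");
            xw.WriteEndElement();
        }

        // The writer takes its encoding from the StringWriter,
        // so the declaration names utf-16, not utf-8.
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WriteToString());
    }
}
```

The declaration in the printed XML should read encoding="utf-16", even though the settings asked for UTF-8.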
Another approach was needed and I found one. XmlWriter.Create also accepts a stream, or a writer that wraps one, so a MemoryStream can be used. A MemoryStream is just a buffer of bytes and does nothing with the content; the StreamWriter wrapped around it encodes to UTF-8 by default. Here is some example code. The first example uses a StringBuilder and doesn't work. The second example uses the MemoryStream.
Example 1 (does not work):
    StringBuilder sb = new StringBuilder();
    using (XmlWriter xw = XmlWriter.Create(new StringWriter(sb)))
    {
        XslCompiledTransform xct = new XslCompiledTransform();
        xct.Load(@"cdcatalog.xsl");
        xct.Transform(@"cdcatalog.xml", xw);
        xw.Flush();
    }
    Console.WriteLine(sb.ToString());
Example 2 (works):
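    MemoryStream ms = new MemoryStream();
    // StreamWriter encodes to UTF-8 by default, so the XmlWriter declares utf-8.
    using (XmlWriter xw = XmlWriter.Create(new StreamWriter(ms)))
    {
        XslCompiledTransform xct = new XslCompiledTransform();
        xct.Load(@"cdcatalog.xsl");
        xct.Transform(@"cdcatalog.xml", xw);
        xw.Flush();
    }
    // ToArray returns only the bytes written; GetBuffer would also return
    // the unused tail of the internal buffer, which decodes to trailing '\0' characters.
    string xmlOutput = Encoding.UTF8.GetString(ms.ToArray());
    Console.WriteLine(xmlOutput);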
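A variation on Example 2, as a sketch only: hand the MemoryStream to XmlWriter.Create directly, together with an XmlWriterSettings. With a stream as the target, the Encoding setting is honored. The names Utf8XmlDemo and WriteUtf8 are hypothetical, and to keep the sketch self-contained the XML is written by hand instead of coming from the cdcatalog transform. I use new UTF8Encoding(false) here on the assumption that no byte order mark is wanted in the output.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class, not part of the original project.
static class Utf8XmlDemo
{
    public static byte[] WriteUtf8()
    {
        var settings = new XmlWriterSettings
        {
            // UTF-8 without a byte order mark (assumption: no BOM wanted).
            Encoding = new UTF8Encoding(false)
        };

        using (var ms = new MemoryStream())
        {
            // Writing to the stream directly, so the settings decide the encoding.
            using (XmlWriter xw = XmlWriter.Create(ms, settings))
            {
                xw.WriteStartElement("catalog");
                xw.WriteElementString("title", "Empire Burlesque");
                xw.WriteEndElement();
            }

            // Only the bytes actually written, without unused buffer space.
            return ms.ToArray();
        }
    }

    static void Main()
    {
        byte[] bytes = WriteUtf8();
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```

The returned bytes can be written to a file or a response as they are; decoding them back into a string is only needed for display.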
You can download an example project here.