Hi all
So, the other day I had this requirement for a BizTalk pipeline component:
Take an InfoPath form and convert it into a PDF that is to be sent out via email. This seemed easy enough. I searched a bit, and found that only a few simple steps were needed:
    FormControl formControl = new FormControl();
    formControl.Open(pInMsg.Data);
    string output = Path.GetTempFileName();
    formControl.XmlForm.CurrentView.Export(output, Microsoft.Office.InfoPath.ExportFormat.Pdf);
Of course, this would also mean some code that would read the PDF file back in and then create the output message. But hey, that was just the price I had to pay.
BUT… I was being naive… As the more clever of my readers have probably already realized, if something is called FORMcontrol, then it is for programs that have a UI. The code crashed big time at runtime with some ActiveX exception :-(
Then I remembered that I have a colleague who had previously told me that she had done this at some point, so I emailed her for her code.
Unfortunately, her code involved taking the form, extracting the XSL from the XSN file, performing a transformation on the XML using the XSL to generate HTML, and then using some utility to convert this into PDF. This was more complex than I had hoped, but I saw no other way. Unfortunately, her code had this line in it:
    StreamReader stream = new StreamReader(XmlFormView.XmlForm.Template.OpenFileFromPackage("View1.xsl"));
So, it seems that I will have to do a lot of dirty work myself :-(
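This turned into quite a list of subtasks, starting with reading the processing instruction at the top of the form XML, which looks like this: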
<?mso-infoPathSolution solutionVersion="1.0.0.2" productVersion="12.0.0" PIVersion="1.0.0.0" href="http://path.to/form.xsn" name="urn:schemas-microsoft-com:office:infopath:MyForm:-myXSD-2009-09-21T15-43-10" ?>
So I am now going from the few lines of code I was hoping for to a more complex solution… so let’s look at the code:
First of all, I need the value of the processing instruction. This is easily done:
    // Assumption: Href is a class-level constant that is not shown in the original listing;
    // the value below is inferred from how it is used.
    private const string Href = "href=\"";

    private static string GetHrefFromXml(XmlDocument infoPathForm)
    {
        XmlNode piNode = infoPathForm.SelectSingleNode("/processing-instruction(\"mso-infoPathSolution\")");
        if (piNode != null && piNode is XmlProcessingInstruction)
        {
            var pi = (XmlProcessingInstruction)piNode;
            string href = pi.Value;
            int location = href.IndexOf(Href);
            if (location != -1)
            {
                // Keep everything between href=" and the next double quote.
                href = href.Substring(location + Href.Length);
                href = href.Substring(0, href.IndexOf("\""));
                return href;
            }
            throw new ApplicationException("No href attribute was found in the processing instruction (mso-infoPathSolution). Without this, the location of the form cannot be detected and without the form no PDF can be generated.");
        }
        throw new ApplicationException("Required XML processing instruction (mso-infoPathSolution) not found. Without this, the location of the form cannot be detected and without the form no PDF can be generated.");
    }
The most annoying part is that the value of a processing instruction can be anything. In this case, it appears to be a list of attributes like “normal” XML, but since this is not guaranteed, there is no language support for getting the value of the href “attribute”. So I chose to use string manipulation to get the value.
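If more than one of the pseudo-attributes is needed, the string manipulation can be generalised a bit. The following is only a sketch, not what the component uses: the class name PiPseudoAttributes and its Parse method are made up for illustration, and it assumes the processing instruction really does consist of name="value" pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the original component).
public static class PiPseudoAttributes
{
    // Matches name="value" or name='value' pairs inside the PI data.
    private static readonly Regex Pair = new Regex(
        @"(?<name>[\w:.-]+)\s*=\s*(?<q>[""'])(?<value>.*?)\k<q>",
        RegexOptions.Compiled);

    // Returns every pseudo-attribute found in the PI data, keyed by name.
    public static Dictionary<string, string> Parse(string piData)
    {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (Match m in Pair.Matches(piData))
            result[m.Groups["name"].Value] = m.Groups["value"].Value;
        return result;
    }
}
```

With this, the href could be read as PiPseudoAttributes.Parse(pi.Value)["href"]. Because each match consumes a whole name="value" pair, text that merely looks like href=" inside another value is not picked up by mistake.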
After getting the href, I need to get the XSN file from SharePoint Server, where the form is published. This turned out to be a challenge also.
My first approach was quite simple:
    private static byte[] GetFormByUrl(string href)
    {
        var wc = new WebClient
        {
            Credentials = CredentialCache.DefaultCredentials
        };
        return wc.DownloadData(href);
    }
This turned out to be too naive, though. When SharePoint and Forms Server get a request for the XSN file, they assume someone is trying to fill out the form. So what I got back was the HTML that Forms Server sends to a user who wants to fill out the form. Then I thought I’d try this:
    private static byte[] GetFormByUrl(string href)
    {
        HttpWebRequest wr = (HttpWebRequest)WebRequest.Create(href);
        wr.AllowAutoRedirect = false;
        using (WebResponse resp = wr.GetResponse())
        using (Stream stream = resp.GetResponseStream())
        using (MemoryStream ms = new MemoryStream())
        {
            byte[] buffer = new byte[1024];
            int bytes;
            // Stream.Read returns 0 at the end of the stream, never -1.
            while ((bytes = stream.Read(buffer, 0, buffer.Length)) > 0)
                ms.Write(buffer, 0, bytes);
            return ms.ToArray();
        }
    }
Basically, using an HttpWebRequest I could ask it to not redirect. This didn’t work either, since what I then got back was some HTML that basically just said that the page has moved. Bummer.
But then another colleague who apparently is better at searching than I am found out that I can add a noredirect parameter to my request that will instruct SharePoint to not redirect. This is different from my current approach because my current approach instructs .NET to not follow redirects, whereas this new approach instructs SharePoint to not ask me to redirect.
So I ended up with something as simple as this:
    private static byte[] GetFormByUrl(string href)
    {
        string url = href + "?noredirect=true";
        var wc = new WebClient
        {
            Credentials = CredentialCache.DefaultCredentials
        };
        return wc.DownloadData(url);
    }
Simple and beautiful
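One small caveat with the plain concatenation: it assumes the href from the processing instruction has no query string of its own. A slightly more defensive variant could look like the sketch below. The class name NoRedirectUrl and its Append method are invented for this example and are not in the component.

```csharp
using System;

// Hypothetical helper (not part of the original component).
public static class NoRedirectUrl
{
    // Appends noredirect=true, using '&' when the URL already has a query string.
    public static string Append(string href)
    {
        string separator = href.IndexOf('?') >= 0 ? "&" : "?";
        return href + separator + "noredirect=true";
    }
}
```

For the href shown earlier this gives the same URL as the concatenation, so it only matters if the form location ever carries parameters.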
Now I have the XSN file and the next issue pops up, naturally: how do I get the XSL extracted from the XSN file? The XSN file is just a cabinet file with another extension, so I thought this must be easy. I found out it is not. I searched and searched and ended up finding all sorts of weird stuff where people used p/invoke and what not. I am surprised that Microsoft has not added at least extraction functionality to the .NET Framework, but they haven’t.
I ended up doing this:
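    // Shell, Folder and FolderItem come from the Shell32 COM library
    // ("Microsoft Shell Controls and Automation"), referenced via COM interop.
    private static string ExtractCabFile(string cabFile)
    {
        // CreateTmp is my own helper (not shown); it creates a directory under the Temp directory.
        string destDir = CreateTmp(true, "");

        var sh = new Shell();
        Folder fldr = sh.NameSpace(destDir);
        foreach (FolderItem f in sh.NameSpace(cabFile).Items())
            fldr.CopyHere(f, 0);
        return destDir;
    }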
This code assumes that the XSN file has been written to a temporary file with the extension .CAB. This is very important, since the shell will open the .CAB file with the default program, which is then Explorer. After that, all files in the cabinet file are copied to “destDir”, which is just a directory created in the user’s Temp directory.
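Writing the downloaded bytes to such a .CAB file is not shown here, but it could be done roughly like this. This is a sketch only: the class name XsnTempFile and its WriteAsCab method are invented for this example and are not the helper the component actually uses.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the original component).
public static class XsnTempFile
{
    // Writes the XSN bytes to a uniquely named .cab file in the user's Temp directory
    // and returns the full path of that file.
    public static string WriteAsCab(byte[] xsnBytes)
    {
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".cab");
        File.WriteAllBytes(path, xsnBytes);
        return path;
    }
}
```

The returned path is what would be passed to ExtractCabFile. The file is not cleaned up automatically, so the caller should delete it once the files have been extracted.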
I am quite annoyed to have to go through all this, but that’s how things go sometimes.
So now I have found the href of the form, downloaded the form and extracted its files. Time for the transformation:
    private static MemoryStream PerformTransformation(XmlDocument xmldoc, string destDir, string view)
    {
        var transform = new XslCompiledTransform();
        // Load the XSL for the chosen view from the directory the cabinet file was extracted to.
        using (var stream = new StreamReader(Path.Combine(destDir, view + ".xsl")))
        using (XmlReader xmlReader = XmlReader.Create(stream))
        {
            transform.Load(xmlReader);
        }

        var outputMemStream = new MemoryStream();
        transform.Transform(xmldoc, null, outputMemStream);
        // Rewind so the caller can read the HTML from the start.
        outputMemStream.Seek(0, SeekOrigin.Begin);
        return outputMemStream;
    }
So just a normal XSLT transformation, resulting in some HTML that is returned in a stream.
After this, I need to convert it into PDF, which is really simple using a tool we bought for this:
    private static byte[] GetPdfFromHtml(Parameters param)
    {
        var pdfConverter = new PdfConverter
        {
            LicenseKey = "SomethingElse - You are not getting the correct License Key "
        };
        byte[] pdfBytes = pdfConverter.GetPdfBytesFromHtmlStream(param.HtmlStream, Encoding.UTF8, param.DestDir.EndsWith(@"\") ? param.DestDir : param.DestDir + @"\");
        return pdfBytes;
    }
We are using the ExpertPDF library for this. The third parameter for the GetPdfBytesFromHtmlStream method call is the directory where the cabinet file was extracted to, since this is where all images used in the form are also kept and they are needed for the PDF to include them.
All in all, the component now works, but it turned out to be a lot more difficult than I had hoped.
As a last detail, I added a property to my pipeline component that the developer can use to decide which view to use for the transformation from XML to HTML.
The complete code for the pipeline component will not be available for download, since this was done for a customer, but I might do something a bit smaller and simpler and add it to my pipeline component collection later on.
--
eliasen
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.
© Copyright 2025, Jan Eliasen