Atom content with a type="xhtml"

Topics: Argotic.Core
Jun 4, 2008 at 1:41 AM
Thanks for the awesome library.  It has saved me a lot of time.

I just ran into an internal feed that uses content type="xhtml", and I noticed that all of the markup in the content is being stripped.  I was able to reproduce it with this simple feed:

<?xml version="1.0" encoding="utf-8" ?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
          <p>AT&amp;T bought <b>by SBC</b>!</p>
          <p>See all the news <a href="http://www.example.org/bought/sbc">here</a></p>
      </div>
    </content>
  </entry>
</feed>

I would expect AtomContent.Content to contain the string "<p xmlns="http://www.w3.org/1999/xhtml">AT&amp;T bought <b>by SBC</b>!</p>
<p xmlns="http://www.w3.org/1999/xhtml">See all the news <a href="http://www.example.org/bought/sbc">here</a></p>".  But line 585 (of the current source in TFS, not a release) uses XPathNavigator.Value, which grabs all of the descendant text nodes and concatenates them.  As a result I end up with "AT&T bought by SBC!See all the news here".

I think it would be more appropriate to use XPathNavigator.InnerXml in this instance.
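
To make the difference concrete outside of .NET, here is a rough Python sketch using the standard library's ElementTree. It is only an analogue of the two XPathNavigator properties, not Argotic's code: text_value and inner_xml are hypothetical helper names, and the feed is a trimmed copy of the one above.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
XHTML = "http://www.w3.org/1999/xhtml"
NS = {"atom": ATOM, "xhtml": XHTML}

# Trimmed copy of the sample feed: only the entry content matters here.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
          <p>AT&amp;T bought <b>by SBC</b>!</p>
          <p>See all the news <a href="http://www.example.org/bought/sbc">here</a></p>
      </div>
    </content>
  </entry>
</feed>"""


def text_value(element):
    """Hypothetical helper: concatenate every descendant text node (markup is lost)."""
    return "".join(element.itertext())


def inner_xml(element):
    """Hypothetical helper: serialize the element's children, keeping the markup."""
    # Emit XHTML as the default namespace instead of an auto-generated prefix.
    ET.register_namespace("", XHTML)
    parts = [element.text or ""]
    # ET.tostring also includes each child's trailing (tail) text.
    parts.extend(ET.tostring(child, encoding="unicode") for child in element)
    return "".join(parts).strip()


div = ET.fromstring(FEED).find("atom:entry/atom:content/xhtml:div", NS)
flattened = text_value(div)  # roughly what a text-only "value" read gives
markup = inner_xml(div)      # roughly what an "inner XML" read gives

print(repr(flattened))
print(markup)
```

The first print shows only the text with every tag gone, while the second keeps the p, b and a elements and the escaped ampersand, which is the behavior I would expect for type="xhtml".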

Thoughts?

Justin Rudd
Jun 4, 2008 at 5:56 AM
Justin,

I agree that when Atom content is encoded as XHTML (or HTML), the encoded value should be returned. Good catch; I will try to get an issue opened to fix this. I have been swamped lately and am falling behind on Argotic, but I really appreciate all of the great feedback people have been providing.
Jun 4, 2008 at 9:27 PM
Hey Brian,

Thanks for the quick response! 

I've got the code patched up already.  I'll post the diff file somewhere and update this discussion.  As for type="html", Argotic is doing the right thing (as far as I can tell).  As HTML, my example above would look like this in the raw feed:

&lt;p&gt;AT&amp;amp;T bought &lt;b&gt;by SBC&lt;/b&gt;!&lt;/p&gt;
&lt;p&gt;See all the news &lt;a href="http://www.example.org/bought/sbc"&gt;here&lt;/a&gt;&lt;/p&gt;

But when pulled out, it is unescaped properly.
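
As a rough illustration of why the text value is the right thing to read for type="html", here is a small Python sketch with the standard library's ElementTree. It is not Argotic's code; the ENTRY fragment and variable names are made up for the example, reusing the first paragraph from above.

```python
import xml.etree.ElementTree as ET

# For type="html" the markup is escaped, so the element holds a single text node.
ENTRY = (
    '<content xmlns="http://www.w3.org/2005/Atom" type="html">'
    "&lt;p&gt;AT&amp;amp;T bought &lt;b&gt;by SBC&lt;/b&gt;!&lt;/p&gt;"
    "</content>"
)

content = ET.fromstring(ENTRY)

# The XML parser removes exactly one level of escaping, leaving HTML source.
html = content.text
print(html)
```

Because the parser undoes one level of escaping, the text node already is the HTML markup, so no inner-XML read is needed in that case.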
Jul 1, 2008 at 6:46 PM
Edited Jul 1, 2008 at 6:47 PM
Created work item #10408 to address this issue. This has been fixed for the next release.