pedralho

  1. This ugly name was my idea, but the project itself is @tonikitoo's as well.
  2. New repository created on Gitorious.org: Illusion Browser: http://gitorious.org/illusi...
  3. New blog post at http://pedralho.blogspot.com/: Capturing Web page frames in image files using WebKit-Qt
  4. Wow! A git fetch from Gitorious.org fails at home because of my slow Internet connection.
  5. Blog updated: http://pedralho.blogspot.com/ -> "WebKit-EFL in progress"
  6. Great! Found a good and easy REGEX! ((?:<(?:/)?(?:.*?)(?:/)?>)(?:[\\s\\u00A0]+)(?:<(?:/)?(?:.*?)(?:/)?>))
  7. For now I just need a regular expression that removes whitespace between HTML tags. It should not remove the spaces inside text nodes!
  8. How to clean up an HTML document? Remove whitespace between nodes, scripts, comments... Tidy would be my answer.
  9. What about looking for repeated subtrees? I'm betting on this!
  10. How to find the structured part of an HTML page: it can be a table, a list, a product item on an Amazon page...
  11. I get really sad when my algorithm fails when I'm sure it is correct...
  12. Support Spread Firefox!, add a #twibbon to your avatar now! - http://bit.ly/4vyhUX
  13. Support I'm root!, add a #twibbon to your avatar now! - http://bit.ly/1SMgz0
  14. Starting a professional Twitter account now! Only English posts about Linux, Mozilla, WebKit, Data Mining, Data Labeling and so on...
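
The whitespace-between-tags idea from tweets 6 and 7 can be sketched in a few lines of Python. This is a simplified illustration, not the regex quoted in the tweet: it only matches whitespace that sits directly between a closing `>` and the next `<`, and the names `BETWEEN_TAGS` and `strip_inter_tag_whitespace` are made up for this example.

```python
import re

# Whitespace (including the no-break space U+00A0) that sits
# directly between the end of one tag and the start of the next.
BETWEEN_TAGS = re.compile(r">[\s\u00A0]+<")


def strip_inter_tag_whitespace(html: str) -> str:
    """Remove whitespace-only runs between adjacent tags.

    Text nodes that contain any non-whitespace character are left
    untouched, because the pattern requires '>' and '<' to be
    separated by whitespace only.
    """
    return BETWEEN_TAGS.sub("><", html)


if __name__ == "__main__":
    sample = "<ul>\n  <li>a b</li>\n  <li> c </li>\n</ul>"
    print(strip_inter_tag_whitespace(sample))
```

Note that a regex like this knows nothing about document structure: it would also drop a meaningful space between inline elements such as `<b>a</b> <i>b</i>`, and it would alter content inside `<pre>`. For anything beyond a quick cleanup, a real HTML parser or a tool like Tidy is probably the safer choice.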