Teve-Blad seems to have changed its layout

Asked by Bernard

I think I've got the latest version (the tar.gz from the download section), but it still doesn't seem to match the Teve-Blad website. Did they change their pages again?

The page parsed by the grabber is something like "http://www.teveblad.be/ndl/zender.asp?move=full&channel=canvas&dag=11/20/2008", right? If you open that page, it only contains links to subpages with the real program guide, such as "http://www.teveblad.be/ndl/zender.asp?pagina=1&channel=canvas&dag=11/20/2008".
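To make the subpage URL pattern concrete, here is a minimal sketch that builds it from a channel, a day and a page number. The class and method names (TeveBladUrls, PageUrl) are made up for illustration and are not part of the grabber; the only things taken from the URLs above are the parameter names (pagina, channel, dag) and the month/day/year date format.

```csharp
using System;
using System.Globalization;

public static class TeveBladUrls
{
    // Hypothetical helper: builds a subpage URL in the shape quoted in the thread.
    public static string PageUrl(string channel, DateTime day, int page)
    {
        // The site expects month/day/year; the invariant culture keeps "/" as the separator.
        string dag = day.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
        return "http://www.teveblad.be/ndl/zender.asp?pagina="
            + page.ToString(CultureInfo.InvariantCulture)
            + "&channel=" + Uri.EscapeDataString(channel)
            + "&dag=" + dag;
    }
}
```

Calling TeveBladUrls.PageUrl("canvas", new DateTime(2008, 11, 20), 1) should reproduce the second URL quoted above.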

Additionally, the page containing the detailed program information doesn't contain the string "progid" anymore. It's now abbreviated to "pid".

I don't know yet whether there are other changes, as I'm still trying to make it work. If I find something else, I'll let you know.

Question information

Language:
English
Status:
Solved
For:
BelGuide
Assignee:
No assignee
Solved by:
Bernard
Bernard (renardeau) said :
#1

In the detail pages, "detailtitles" is replaced by "tb_dt" and "detailcontent" is replaced by "tb_dc".
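The renamed markers reported so far can be kept in one lookup table, so the parser only needs updating in one place if the site changes again. This is only a sketch: the class and member names are hypothetical, and how the grabber actually searches for these strings is not shown in this thread.

```csharp
using System.Collections.Generic;

public static class TeveBladMarkers
{
    // Old marker -> new marker, exactly as reported in this thread.
    public static readonly IReadOnlyDictionary<string, string> Renamed =
        new Dictionary<string, string>
        {
            { "progid", "pid" },
            { "detailtitles", "tb_dt" },
            { "detailcontent", "tb_dc" },
        };

    // Returns the new marker if one is known, otherwise the name unchanged.
    public static string Current(string oldMarker)
    {
        string updated;
        return Renamed.TryGetValue(oldMarker, out updated) ? updated : oldMarker;
    }
}
```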

Hulkie (hulkie2) said :
#2

Yes,

they changed the layout and it is pretty difficult to fix. The main problem is that they have removed the option to show both pages at once when there are two pages (this used to be possible with 'move=full'). What's more, it is impossible to go to the second page directly. For instance, if you try:
http://www.teveblad.be/ndl/zender.asp?pagina=2&channel=canvas&dag=11/20/2008
then the site shows you page 1 nonetheless. If you try again (for example by pressing reload in your browser), it shows page 2 correctly. So it appears that the site checks whether the previous page you visited was page 1, and only then shows page 2. As I download the page directly, I suppose no valid answer is given to the question of what my previously visited page was, and therefore I get page 1 again.
So, while I can correct the things you suggested, you are still left with holes in the program guide... Perhaps we could script Firefox to download the relevant pages, but I don't know how to do this. Another possibility is to look for another site...

If you find solutions to this problem, please keep me informed.
Hulkie

Bernard (renardeau) said :
#3

The full option still seems to exist in the 'personalisatie' pages,
for example http://www.teveblad.be/ndl/person.asp?move=full&Pers_nummer=442526
Maybe we can use those pages to solve the problem by creating a page for every channel, or by creating one page showing all desired channels?
I saw they are using cookies (the website doesn't work if you deactivate them), so that is probably where they save the previously visited page info. So, if we can find and change the cookie, we'll probably be able to access the second page directly, too.

I haven't tested any of these yet, because I've got another annoying problem: all the programs listed in my program guide are 1 hour behind... Is there a UTC to/from local time conversion somewhere?

greets
B.

Bernard (renardeau) said :
#4

Solved by replacing WebClient with HttpWebRequest to get cookie support:

  // needs: using System; using System.IO; using System.Net;

  // loads a single page; the cookie container lets the server set its cookies
  public bool LoadFile(string location) {
   try {
    theParser.CleanUp();

    Console.WriteLine("Start request");
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(location);
    request.Timeout = 1000; // 1 sec (in millisecs)
    request.CookieContainer = new CookieContainer();

    // using closes the response even if reading or parsing throws
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse()) {
     theParser.Init(getMsgContent(response));
    }
    return true;
   } catch (Exception e) {
    Console.WriteLine(e.ToString());
    return false;
   }
  }

  // gets the cookie from locationCookie, then loads the page at locationPage
  public bool LoadFileWithCookie(string locationCookie, string locationPage) {
   try {
    theParser.CleanUp();

    // one container shared by both requests carries the cookies across
    CookieContainer cookCont = new CookieContainer();

    Console.WriteLine("Start first request to get cookie");
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(locationCookie);
    request.Timeout = 1000; // 1 sec (in millisecs)
    request.CookieContainer = cookCont;
    // only the cookies of the first page are needed, not its body
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse()) { }

    Console.WriteLine("Start second request to get data");
    HttpWebRequest request2 = (HttpWebRequest)WebRequest.Create(locationPage);
    request2.Timeout = 1000; // 1 sec (in millisecs)
    request2.CookieContainer = cookCont;
    // using closes the second response too, even if parsing throws
    using (HttpWebResponse response2 = (HttpWebResponse)request2.GetResponse()) {
     theParser.Init(getMsgContent(response2));
    }
    return true;
   } catch (Exception e) {
    Console.WriteLine(e.ToString());
    return false;
   }
  }

  public string getMsgContent(HttpWebResponse response)
  {
   // Get the stream containing content returned by the server
   // and open it with a StreamReader for easy access.
   using (Stream dataStream = response.GetResponseStream())
   using (StreamReader reader = new StreamReader(dataStream)) {
    // Read the whole content as one string.
    return reader.ReadToEnd();
   }
  }
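For comparison, the same "visit page 1 first, then request page 2" idea can be sketched with HttpClient instead of HttpWebRequest. This is a different technique from the code above, not what BelGuide uses, and the names PagedFetcher, CreateCookieClient and FetchAfterVisitingAsync are invented for this example. It assumes the site only needs the cookies from the first response, which is what the fix above suggests but which has not been checked separately here.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class PagedFetcher
{
    // Builds a client whose handler stores cookies and sends them back
    // on later requests to the same site.
    public static HttpClient CreateCookieClient()
    {
        var handler = new HttpClientHandler { CookieContainer = new CookieContainer() };
        return new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(10) };
    }

    // Requests firstPageUrl only for its side effects (cookies), then
    // returns the body of wantedPageUrl fetched through the same client.
    public static async Task<string> FetchAfterVisitingAsync(
        HttpClient client, string firstPageUrl, string wantedPageUrl)
    {
        using (HttpResponseMessage first = await client.GetAsync(firstPageUrl))
        {
            first.EnsureSuccessStatusCode();
        }
        using (HttpResponseMessage wanted = await client.GetAsync(wantedPageUrl))
        {
            wanted.EnsureSuccessStatusCode();
            return await wanted.Content.ReadAsStringAsync();
        }
    }
}
```

With a client from CreateCookieClient(), the two URLs to pass would be the pagina=1 and pagina=2 variants of the same channel and day. The 10 second timeout is an arbitrary choice for the sketch.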

The time issue is also solved, by using a +0100 suffix for the start and end times instead of +0000.
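A fixed +0100 suffix is correct for Belgian winter time, but Belgium switches to +0200 during summer time, so a hard-coded value would be an hour off again for part of the year. Below is a minimal sketch that derives the suffix from a time zone instead. It assumes the guide times are written as "yyyyMMddHHmmss" followed by a space and the offset (the XMLTV style); the thread only mentions the suffix itself, so that layout is an assumption, and the names GuideTime and Format are hypothetical.

```csharp
using System;
using System.Globalization;

public static class GuideTime
{
    // Formats a local broadcast time with the UTC offset that the given
    // time zone has at that moment, e.g. "20081120200000 +0100".
    public static string Format(DateTime localTime, TimeZoneInfo zone)
    {
        // Treat the value as wall-clock time in 'zone', whatever Kind it had.
        DateTime wallClock = DateTime.SpecifyKind(localTime, DateTimeKind.Unspecified);
        TimeSpan offset = zone.GetUtcOffset(wallClock);
        string sign = offset < TimeSpan.Zero ? "-" : "+";
        TimeSpan abs = offset.Duration();
        return wallClock.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture)
            + " " + sign
            + abs.Hours.ToString("00", CultureInfo.InvariantCulture)
            + abs.Minutes.ToString("00", CultureInfo.InvariantCulture);
    }
}
```

The caller supplies the zone. Which identifier to pass to TimeZoneInfo.FindSystemTimeZoneById depends on the platform ("Europe/Brussels" on Linux and macOS, "Romance Standard Time" on older Windows setups), so the sketch leaves that choice out.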

Hulkie (hulkie2) said :
#5

Hey,

thanks for helping me solve this, because I had no idea how to do it. I had just found a way to do it with macros in Firefox, but this way is much faster and cleaner! I did not find a neat way to integrate your code directly, but I used the core of it and everything is working perfectly again. I will upload the new program as soon as possible (the Majestic parser library was updated as well; apparently it is now 2 times faster).

Thanks,
Hulkie