Author Topic: multi-page support  (Read 28781 times)

Offline chip!

  • Bad Ass
  • Administrator
  • Unstoppable
  • *****
  • Posts: 2301
  • Karma: +629/-6
    • View Profile
multi-page support
« on: March 03, 2014, 01:57:29 pm »
using the latest beta with the new functions resubmit() and getcookie(), we are able to achieve a working multi-page search. attached is a proof of concept, except that it does not respect the BC->search result limit, because this is a rather nifty hack on the script engine: it uses a cookie as a 'settings' variable and resubmits the request until there are no more results.. hmm :)
  -  https://convivea.com  -   And...  boom goes the dynamite.

Offline chip!

Re: multi-page support
« Reply #1 on: April 14, 2014, 11:45:11 am »
this POC will still work with the latest builds of BC. however, proper/standardized multi-page support has been enabled and will be available in any build 2.0.71+

Offline CuF

  • Sr. Member
  • ****
  • Posts: 331
  • Karma: +40/-0
    • View Profile
Re: multi-page support
« Reply #2 on: December 21, 2014, 07:57:01 am »
Is multipage officially supported now?  I tried playing with the sample script and I'm definitely a bit lost on how to implement it.


Offline chip!

Re: multi-page support
« Reply #3 on: December 21, 2014, 10:22:35 am »
yeah, it has been working for the scripts that are enabled..   here was a little write up on how to use it (i suppose you have seen this):  http://convivea.com/forums/index.php/topic,2398.0.html 

basically,

a. you just add "[nextpage]" section to the .ini 

b. set 'morepages=' to the exact HTML that BC should search for to indicate that there are more pages to fetch. normally this would be some HTML associated with a 'More' or 'Next' button: usually the webpage has a specific button to go to the next result page, or an icon image that shows there are more pages.

c. then configure type=1 and initial=1  (see the little descriptions of each)

d. and finally, make sure to use %PAGENUM% var in your Search URL.  when BC finds the 'morepages=' HTML in the results, it then updates the search URL to fetch the next page.
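
The steps above can be sketched in Python. This is only an illustration of the idea, not Bit Che's actual implementation: the function names and the sample HTML are made up, and treating type=1 as "add 1 to the page counter" is an assumption based on the description in this thread. Only 'morepages=', %SEARCH% and %PAGENUM% come from the posts.

```python
from typing import Optional


def build_url(template: str, search: str, page: int) -> str:
    """Fill the %SEARCH% and %PAGENUM% placeholders of a search URL template."""
    return template.replace("%SEARCH%", search).replace("%PAGENUM%", str(page))


def next_page_url(template: str, search: str, html: str,
                  morepages: str, page: int) -> Optional[str]:
    """Return the URL of the next result page, or None when the
    'morepages' marker is absent from the fetched HTML."""
    if morepages not in html:
        return None
    # Assumption: type=1 means the counter simply increases by one per page.
    return build_url(template, search, page + 1)


template = "/search?q=%SEARCH%&p=%PAGENUM%"
first_page_html = '<a href="/search?q=cow&p=2">Next</a>'  # made-up page body

print(next_page_url(template, "cow", first_page_html, ">Next</a>", 1))
print(next_page_url(template, "cow", "<td>Next</td>", ">Next</a>", 1))
```

The first call yields a URL for page 2 because the marker is present; the second returns None, which is the "stop fetching" case.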


this script is working:  https://github.com/convivea/bit-che-3-scripts/blob/master/scripts/default/KickassTorrents.com.ini

but let me know where you are getting stuck so i can help better :)




Offline chip!

Re: multi-page support
« Reply #4 on: December 21, 2014, 10:48:01 am »
as an example, for the new btdigg script,  I would try setting

Code: [Select]
morepages=>Next →</a>
because that Next button always has a link if there are more pages, but when you get to the last page, the button loses the href link and becomes "<td>Next →</td>". So Bit Che would not try to fetch any more pages, because the 'morepages=' HTML would not be found on the last page (since the "</a>" is not there on the last page).



and for the others:

type=1  (because the pages count sequentially: 1, 2, 3, 4, etc.)
initial=0  (because the first page is "/search?q=search&p=0", i.e. page 1 actually starts at page 0)

and then for the search URL:

Code: [Select]
/search?q=%SEARCH%&p=%PAGENUM%

I haven't tried it yet, but just looking at the page, i think that would work.. let me know?
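
A small Python check of the reasoning above, as a sketch only. The two HTML fragments are hypothetical (modelled on the description in this post, not copied from BTDigg), and the helper names are invented for the example.

```python
from typing import List

MOREPAGES = ">Next \u2192</a>"  # the 'morepages=' value suggested above (\u2192 is the arrow)

# Hypothetical fragments of a result page, modelled on the post above.
middle_page = '<td><a href="/search?q=cow&p=2">Next \u2192</a></td>'
last_page = "<td>Next \u2192</td>"


def has_more_pages(html: str, marker: str = MOREPAGES) -> bool:
    """True when the marker string occurs verbatim in the page HTML."""
    return marker in html


def page_counters(initial: int, pages: int) -> List[int]:
    """Counters for sequential (type=1) paging that starts at 'initial'."""
    return [initial + i for i in range(pages)]


print(has_more_pages(middle_page))  # marker present: fetch another page
print(has_more_pages(last_page))    # marker absent: stop
print(page_counters(0, 4))          # initial=0 gives p=0, 1, 2, 3
```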
« Last Edit: December 21, 2014, 10:53:06 am by chip! »

Offline CuF

Re: multi-page support
« Reply #5 on: December 21, 2014, 03:59:55 pm »
Thanks, I thought you had that info somewhere.  My search only turned up this thread.

I came to the same conclusion about "morepages=>Next →</a>", except for a few things.

First the → symbol requires the script be in Unicode format.   ANSI will lose the symbol and UTF-8 scripts aren't read properly by BitChe.
The other problem is it still doesn't work.  It grabs the first 10 hits then stops.
The debugger doesn't provide any info that might help, but it IS properly getting page zero.
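
The encoding problem can be reproduced outside Bit Che. The Python sketch below rests on two assumptions that the thread does not confirm: that "ANSI" means Windows-1252, and that "Unicode format" means UTF-16 as saved by Windows editors. It shows why the arrow cannot survive an ANSI file and what it looks like when UTF-8 bytes are read as ANSI.

```python
arrow = "\u2192"  # the arrow character used in the 'morepages=' marker

# 1. Windows-1252 (assumed "ANSI") has no code point for the arrow.
try:
    arrow.encode("cp1252")
    ansi_ok = True
except UnicodeEncodeError:
    ansi_ok = False

# With a lossy conversion the arrow degrades to '?', so the marker no longer matches.
lossy = ">Next \u2192</a>".encode("cp1252", errors="replace").decode("cp1252")

# 2. UTF-8 bytes read as if they were Windows-1252 become three other characters.
misread = arrow.encode("utf-8").decode("cp1252")

# 3. UTF-16 (assumed "Unicode format") round-trips the arrow intact.
roundtrip = arrow.encode("utf-16").decode("utf-16")

print(ansi_ok, lossy, misread, roundtrip == arrow)
```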

Script attached.

Offline chip!

Re: multi-page support
« Reply #6 on: December 22, 2014, 04:37:57 am »
Thanks... I'll try to see where I can improve this. The odd thing, your script with multi paging works for me.

Offline CuF

Re: multi-page support
« Reply #7 on: December 22, 2014, 05:08:21 am »
Just to be clear, I'm running v3.0 build 10 and results are set to unlimited.  (Also tried lower numbers to rule that out.)

Edit: Just noticed I didn't have "Fetch results from additional pages when there are multiple pages" enabled... and that still didn't fix the problem.

Btw, off topic, but can Seeds/Leeches be extracted from magnet links?
« Last Edit: December 22, 2014, 05:13:29 am by CuF »

Offline Artax

  • Newbie Member
  • *
  • Posts: 19
  • Karma: +1/-0
    • View Profile
Re: multi-page support
« Reply #8 on: December 22, 2014, 07:46:13 am »
CuF, I don't know if this can be useful, but btdigg works well on qbittorrent, a client with a search engine based on python
« Last Edit: December 22, 2014, 07:48:16 am by Artax »

Offline chip!

Re: multi-page support
« Reply #9 on: December 22, 2014, 02:09:39 pm »
Quote from CuF: "First the → symbol requires the script be in Unicode format.   ANSI will lose the symbol and UTF-8 scripts aren't read properly by BitChe."

alright, I made some progress... Unicode scripts can now be used and also the debugger can show those characters (such as → ).

also, I included some 'morepage' stuff in the debugger, so at least now you can tell ahead of time if BC will launch the next page.

and, I added a 'Run' button to the debugger, so you can let the script finish processing the current page, and the debugger will stay loaded and ready for once the 2nd page is fetched.

will push for the next beta.
« Last Edit: December 22, 2014, 02:20:21 pm by chip! »

Offline CuF

Re: multi-page support
« Reply #10 on: December 22, 2014, 02:43:11 pm »
Cool.  Looking forward to figuring this out.
I think I tried multipaging once before and failed then too.

Offline chip!

Re: multi-page support
« Reply #11 on: January 01, 2015, 08:10:45 am »
okee dokey.. grab the latest from here: http://convivea.com/forums/index.php/topic,2415.msg21136.html#msg21136

in the debugger you will notice at the bottom of the window, details on morepage to help with multi-page stuff.

for the html, you can use @data=utf8decode(@data) if the source HTML page is in the UTF-8 character set and you have issues viewing certain characters.
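
For readers unfamiliar with this kind of fix, here is a rough Python analogue. It is an assumption about what a "utf8decode" step typically does (reinterpret mis-decoded text as UTF-8), not a description of Bit Che's internals; the function name `utf8_decode` and the choice of Windows-1252 as the wrong encoding are hypothetical.

```python
def utf8_decode(text: str, wrong_encoding: str = "cp1252") -> str:
    """Undo a mis-decoding: turn the text back into the bytes it came from,
    then decode those bytes as UTF-8. The input is returned unchanged when
    it cannot be re-encoded or the bytes are not valid UTF-8."""
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# How '>Next \u2192</a>' looks after its UTF-8 bytes were read as cp1252.
garbled = ">Next \u00e2\u2020\u2019</a>"

print(utf8_decode(garbled))
print(utf8_decode("plain ascii"))
```

After this step the 'morepages=' marker containing the arrow can match the page text again; plain ASCII passes through untouched.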

let me know what you think?

Offline CuF

Re: multi-page support
« Reply #12 on: January 06, 2015, 08:06:53 am »
Nice to have that info in the debugger.  I see:
Code: [Select]
MorePages:
Find: >Next →</a>
Result: Found
Current Counter: 0
Current Page: http://btdigg.org/search?info_hash=&order=0&q=cow&p=0
Next Counter: 1
Next Page: http://btdigg.org/search?info_hash=&order=0&q=cow&p=1

But after the first 10 in the @r Bit Che still stops.  I must be missing something.

2 other thoughts that are a bit off topic.
The latest version of BC, 3.5 build 8, has a weird bug: if you click away from the Debugger and then click in it to bring it to the front, it flickers like mad.

Also, BTDigg uses Magnet links.  Do I have to use:
Code: [Select]
$seeds=put(1)
$leeches=put(1)
...or is it possible to get the real numbers?

Offline chip!

Re: multi-page support
« Reply #13 on: January 16, 2015, 01:50:46 pm »

So in the debugger, on the first HTTP page request, there are only 10 results given, so @r only has 10 items. Bit Che finishes the script for that HTTP request. Then it checks for the 'morepages=' string and, if found, generates the URL for the next HTTP request, fetches the next page, and restarts the script. So if you press the 'Run' button, it should fetch the next page and then pause in the debugger at the start of the script for the 2nd page (the 2nd set of 10 items would then go to @r again). Not sure if that helps? Or are you saying it never fetches the 2nd page at all, even though the debugger says 'Found' for the more pages?
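
The fetch, run, check, repeat cycle described here can be simulated in a few lines of Python. Everything in this sketch is a stand-in: `fake_fetch` replaces the real HTTP request, `run_script` replaces the site script that fills @r, and the three-page data set is invented. It only illustrates the control flow as described in this post.

```python
from typing import Callable, List

MOREPAGES = ">Next</a>"  # marker string, simplified for the sketch
TEMPLATE = "/search?q=%SEARCH%&p=%PAGENUM%"


def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP request: three pages of 10 results each."""
    page = int(url.rsplit("=", 1)[1])
    rows = "".join(f"<li>result {page * 10 + i}</li>" for i in range(10))
    nav = '<a href="#">Next</a>' if page < 2 else "<td>Next</td>"
    return rows + nav


def run_script(html: str) -> List[str]:
    """Stand-in for the site script: extract this page's results."""
    return [chunk.split("</li>")[0] for chunk in html.split("<li>")[1:]]


def search(term: str, fetch: Callable[[str], str], initial: int = 0) -> List[str]:
    """Fetch pages until the marker is missing, collecting every page's results."""
    results: List[str] = []
    page = initial
    while True:
        url = TEMPLATE.replace("%SEARCH%", term).replace("%PAGENUM%", str(page))
        html = fetch(url)
        results.extend(run_script(html))  # the script restarts for every page
        if MOREPAGES not in html:         # marker missing: last page reached
            break
        page += 1                         # sequential counter, as with type=1
    return results


all_results = search("cow", fake_fetch)
print(len(all_results), all_results[0], all_results[-1])
```

Each pass through the loop sees only its own 10 results, which matches the debugger showing 10 items in @r per page.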


And I'll look into that flicker bug, I can reproduce it so I should be able to fix it.


And for magnets, they don't contain seeds/leeches. Those can only be obtained by the torrent downloader once it connects to the tracker and gets an update (Bit Che can do that on the Torrent Details dialog using the Scrape button, but it would not make sense to have the script launch Scrape requests for each result it finds during a search).
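
To see what a magnet link does carry, it can be split into its query fields. The link in this Python sketch is made up (placeholder info hash and tracker); the field names xt, dn and tr are the common magnet URI parameters.

```python
from urllib.parse import parse_qs

# Made-up magnet link: the info hash and tracker are placeholders.
magnet = ("magnet:?xt=urn:btih:0123456789abcdef0123456789abcdef01234567"
          "&dn=example+file&tr=udp%3A%2F%2Ftracker.example.org%3A1337")

# Everything after '?' is an ordinary query string.
fields = parse_qs(magnet.split("?", 1)[1])

info_hash = fields["xt"][0].rsplit(":", 1)[1]  # exact topic: the torrent's hash
name = fields["dn"][0]                         # display name
trackers = fields["tr"]                        # tracker URLs to ask for peers

print(sorted(fields))
print(info_hash, name, trackers)
```

The link identifies the torrent and where to ask about it, but holds no swarm counts; those only exist on the tracker side at the moment it is queried.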









Offline CuF

Re: multi-page support
« Reply #14 on: January 16, 2015, 02:37:06 pm »
After the 10th @r, all debugger buttons deactivate/turn grey, even though MorePages says:
Code: [Select]
MorePages:
Find: >Next →</a>
Result: Found
Current Counter: 0
Current Page: http://btdigg.org/search?info_hash=&order=0&q=test&p=0
Next Counter: 1
Next Page: http://btdigg.org/search?info_hash=&order=0&q=test&p=1

Similarly, without the debugger only 10 results are found.

As far as the Magnet links, I understand about scraping them.  I was mostly checking to make sure setting the seeders/leechers to 1 was the preferred scripting for that.  I thought maybe BC recognized a link as Magnet and automatically did something with the seeds/leeches so put(1) isn't necessary.