C from erlang via linked-in driver: Comments
http://www.chrisumbel.com/article/c_from_erlang_curl#comments
Sun, 05 Feb 2012 02:36:41 GMT

Comment by Mazen Harake on Fri Nov 06 2009 09:11:54 GMT+0000 (UTC)

Nice intro, except for the 'noop' thing which is pretty ugly actually :) The rest was a really nice intro though.

/Mazen

Comment by chrisumbel on Wed Nov 11 2009 09:11:41 GMT+0000 (UTC)

Hey, thanks for the feedback. You're certainly right about that. I can't imagine I intended to leave it that way. I'm certainly cleaning that up.

Comment by Michael Terry on Tue Jun 22 2010 00:06:46 GMT+0000 (UTC)

I've heard of another project using libcurl with its Erlang crawler. Did you try ibrowse? Is there any theoretical benefit to libcurl over ibrowse (or any native Erlang client, assuming it works)? How was the http module failing under pressure?

Comment by chrisumbel on Wed Jun 23 2010 04:06:19 GMT+0000 (UTC)

eee, it's been a while and I've had little reason to look at the crawler in months, but I dealt with some kind of persistent failure (I realize that's not useful information :)) that showed up under extreme parallelization (which is why I went to Erlang in the first place). I also recall mucking with ibrowse and ultimately dismissing it, for some reason or another, while prototyping the project.

Maybe over the next week or so I'll dig through the project and see if I can refresh my memory. Believe you me, I'd love to keep that as pure Erlang as possible and, assuming it worked, I'd prefer a pure-Erlang implementation.

Also, keep in mind the purpose of the article. It's a description of how to write a linked-in driver in general, not HTTP fetching. The curl-based HTTP fetch was just an example payload, even if it turns out to be contrived.

Comment by Michael Terry on Thu Jun 24 2010 00:06:13 GMT+0000 (UTC)

Oh, sure, no problem. I was just curious. As a linked-in driver example, it's great, and actually more interesting to me than others I've seen.

Comment by cignos on Fri Jul 23 2010 08:07:41 GMT+0000 (UTC)

It seems that C I/O functions block the OS thread rather than the Erlang process, so concurrent calls to eurl:curl() appear to be serialized. Running ex1:run() with a large concurrency (M) and a slow-responding URL (Url) in the following code illustrates the problem. ("curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10);" in eurl.c kills the Erlang VM on a slow page due to the timeout, so running this test requires disabling it.)

    -module(ex1).
    -compile(export_all).

    %% Start the driver, spawn M testers that each fetch Url N times,
    %% then block until any message arrives.
    run(M, N, Url) ->
        eurl:start(),
        spawn_testers(M, N, Url),
        receive _ -> ok end.

    spawn_testers(0, _, _) -> ok;
    spawn_testers(M, N, Url) ->
        spawn(?MODULE, tester, [N, Url]),
        spawn_testers(M - 1, N, Url).

    %% now/0 is deprecated in newer OTP releases; erlang:timestamp/0
    %% returns the same tuple format.
    tester(N, Url) ->
        io:format("Start tester: ~p ~p~n", [now(), self()]),
        tester_loop(N, Url).

    %% Fetch Url N times, printing a timestamp after each fetch so that
    %% serialized completion times become visible.
    tester_loop(0, _) -> ok;
    tester_loop(N, Url) ->
        eurl:curl(Url),
        io:format("~p: ~p ~p~n", [N, self(), now()]),
        tester_loop(N - 1, Url).

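The serialization cignos describes can be sketched outside the Erlang VM. What follows is a minimal C illustration under stated assumptions, not code from eurl.c: `blocking_fetch` is a hypothetical stand-in that merely sleeps instead of calling libcurl, and `run_serial` and `run_threaded` are made-up names. It uses plain POSIX threads rather than the erl_driver API. The only point is that blocking calls made on one thread add up, while the same calls handed to worker threads overlap.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for a blocking transfer: sleeps for `ms` milliseconds. */
static void blocking_fetch(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Thread entry point: performs one blocking fetch. */
static void *fetch_worker(void *arg) {
    blocking_fetch(*(long *)arg);
    return NULL;
}

/* Monotonic clock reading in milliseconds. */
static long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Every call runs on the caller's thread, the way a blocking driver
 * callback occupies the thread it was invoked on. Returns elapsed ms. */
long run_serial(int calls, long ms) {
    long start = now_ms();
    for (int i = 0; i < calls; i++) {
        blocking_fetch(ms);
    }
    return now_ms() - start;
}

/* Each call runs on its own worker thread; the caller only waits for
 * completion. Returns elapsed ms, or -1 if a thread could not be created. */
long run_threaded(int calls, long ms) {
    enum { MAX_CALLS = 64 };
    pthread_t tids[MAX_CALLS];
    assert(calls >= 0 && calls <= MAX_CALLS);

    long start = now_ms();
    int started = 0;
    for (; started < calls; started++) {
        if (pthread_create(&tids[started], NULL, fetch_worker, &ms) != 0) {
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return started == calls ? now_ms() - start : -1;
}
```

With four calls of 50 ms each, `run_serial(4, 50)` should take roughly the sum of the delays, while `run_threaded(4, 50)` should take roughly one delay. In an actual linked-in driver the analogous options would presumably be erl_driver's async thread pool (`driver_async`) or a non-blocking I/O API, rather than raw pthreads.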
Comment by Michael on Sun Aug 01 2010 17:06:36 GMT+0000 (UTC)

Hi Chris,

You said: "I've heard of another project using libcurl with its erlang crawler". Which project is it?

Comment by Michael on Sun Aug 01 2010 17:09:05 GMT+0000 (UTC)

Hi cignos,

Your note about blocking the OS thread is interesting. How can we fix Chris's C code then?

Comment by cignos on Tue Aug 03 2010 03:06:51 GMT+0000 (UTC)

Hi, Michael,

Well, I searched the Internet for a solution and, as a matter of course, found that the answer is to provide Erlang-thread wrapper versions of the blocking C functions, as is done in GNU Pth, the State Threads Library, etc.

Comment by chrisumbel on Fri Aug 06 2010 12:04:44 GMT+0000 (UTC)

Michael, it was actually another poster who mentioned another project using libcurl. I too am interested in knowing which project.

Starting to look like I need threaded comments around here :)

Comment by chrisumbel on Tue Sep 21 2010 01:59:14 GMT+0000 (UTC)

While it should still be considered a work in progress, I changed this to use libcurl's multi interface to avoid blocking on I/O.

Comment by Mattress Review on Sat Dec 10 2011 09:10:18 GMT+0000 (UTC)

I suggest adding a "google+" button for the blog!

Hellen