Lol 5000 pages thrown at the likes of a cutting-edge LLM is like a Planck length of reading.
Glad technology is crushing that dumb loophole.
Edit: Lol people downvoting this are a good 6 months behind AI news.
You cannot rely on an LLM to summarise accurately.
Plus, that’s not a good task for an LLM because its context window would almost certainly be too short.
It would “hallucinate” because it could only “remember” a fraction of the content and then everyone would be all pissy because they used the program wrong.
I mean, you can pretty simply just engineer around that. Dumping 5k pages in at once is obviously an idiotic way of approaching the issue. Instead, have an LLM go through 500 words at a time, with 125 words of overlap between consecutive windows, to pull out key words, phrases, and intentions, and put those into a structured data format like JSON. Then parse the JSON to pick up on regions where specific sets of phrases and words occur. Give those sections, in part or entirely, to the LLM again, and again have it give you structured output. Further parse and repeat. Do all of these actions several times to get a probability distribution for each assumption about what is being said or intended. Build the results into a Bayes net, or however you like, to get at the most likely summaries of what the document is saying. These results can then be manually reviewed. If you are touchy, you can even adjust the sensitivity to pick up on much more nuanced reads of the text.
Like, if the limit of your imagination is throwing spaghetti against a wall, obviously your results are going to turn out like shit. But with a bit of hand-holding, some structure and engineering, LLMs can be made to substantially outperform their (average) human counterparts. They do already. Use them in a more probabilistic way to create distributions around the assumptions they make, and you can set up a system which will vastly outperform what an individual human can do.
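For anyone who wants to see the windowing step concretely, here is a minimal sketch in Python of the 500-word window with 125 words of overlap. The function name `chunk_words` and the whitespace-based definition of a "word" are assumptions made for illustration; the LLM call and the JSON extraction are left out entirely.

```python
def chunk_words(text, size=500, overlap=125):
    """Split text into windows of `size` words; each window shares
    `overlap` words with the one before it."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()  # assumption: a "word" is a whitespace-separated token
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

Each chunk would then be sent to the model separately. The overlap exists so that a phrase straddling a window boundary still appears whole in at least one window.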
(just asked up the thread:)
GPT-4 & Claude 3 Opus have made little summarization oopsies for me this past week. You’d trust ‘em in such a high profile case?
What if you run them 100 times over the same text?
This is court, not a school project or academia. But in general I agree with you.
No this is discovery and we're discussing how you would engineer a system to support automating it.
LLMs are still pretty limited, but I would agree with you that if there was a single task at which they can excel, it's translating and summarizing. They also have much bigger contexts than 500 words. I think ChatGPT has a 32k token context which is certainly enough to summarize entire chapters at a time.
You'd definitely need to review the result by hand, but AI could suggest certain key things to look for.
People were doing this somewhat effectively with garbage Markov chains and it was 'ok'. There is research going on right now doing precisely what I described. I know because I wrote a demo for the researcher whose team wanted to do this, and we're not even using fine-tuned LLMs. You can overcome many of the issues around 'hallucinations' by just repeating the same thing several times to get to a probability. There are teams funded in the hundreds of millions to build the engineering around these things. Wrap calls in enough engineering and get the bumper rails into place, and the current generation of LLMs is completely capable of what I described.
This current generation of AI revolution is just getting started. We're in the 'Deep Blue' phase, where people are shocked that an AI can even do the thing as well as or better than humans. We'll be at AlphaGo in a few years, and we simply won't recognize the world we live in. In a decade, AI will be the authority, and people will be questioning whether humans should be allowed to do certain things.
Read a little further. I might disagree with you about the overall capability/potential of AI, but I agree this is a great task to highlight its strengths.
Sure, and yes, I think we largely agree. On the differences, I've seen that they can effectively be overcome by making the same call repeatedly and looking at the distribution of results. It's probably not as good as just having a better underlying model, but even then the same approach might be necessary.
Exactly. There is already one recent case where a lawyer filed a brief generated by an LLM. The judge is the one that discovered the cited cases were works of fiction created by the LLM and had no actual basis in law. To say that the lawyer looked foolish is putting it lightly…
Right, but that's not what we're talking about here - we're not saying "Hey LLM, write a convincing sounding legal argument for X", we're saying "Hey LLM, here's a massive block of text, summarize what you can and give me references to places in the text that answer my questions so I can look at the actual text as part of building my own convincing sounding legal argument for X."
It's the difference between doing a report on a topic by just quoting the Wikipedia article, versus using the Wikipedia article to get a list of useful sources on the topic.
But you can use it as a tool to assist. If it finds something actionable, you can confirm the old-fashioned way, by doing the actual reading.
until it just makes stuff up, as they have done
I’ve used AI. I’ve had it make stuff up or put incorrect info into documents. I’m smart* and read through the document just like these lawyers will**. Saved me TONS of time vs just doing all the specific writing, scanning and summarizing. *citation required **smartness notwithstanding
There are specialized LLMs that (if the document is digitized) will actually cite their references within the data they've been given and provide direct links. It'd still need proofreading, as someone would have to check those citations, but it would still speed up the process immensely.
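Part of that citation check can be automated. As a rough sketch, assuming the tool returns the passages it claims to be quoting as plain strings, you can confirm each one really occurs in the digitized text before a human looks at it. The names `normalize` and `unverified_quotes` are made up for this example and do not come from any particular product.

```python
def normalize(text):
    """Collapse whitespace and lowercase so line breaks don't cause false misses."""
    return " ".join(text.split()).lower()

def unverified_quotes(source, quotes):
    """Return the quotes that are empty or do not occur in `source`
    (ignoring case and whitespace differences)."""
    haystack = normalize(source)
    return [q for q in quotes if not normalize(q) or normalize(q) not in haystack]
```

A quote that passes only proves the passage exists in the document. Whether it supports the summary's claim still needs a person to read it.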
You can't rely on people to summarize it accurately either. Humans make mistakes too. The difference is that I can ask an LLM to do the summarizing 10x, and calculate a statistical probability of a given statement being present or true in the text, at a very low cost. Just because LLMs aren't 100% reliable doesn't make humans the best bar to rely upon either.
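A minimal sketch of the "run it 10x" idea, assuming each run has already been reduced to a list of claim strings (that reduction step is the hard part and is not shown). `claim_frequencies` is a hypothetical helper name.

```python
from collections import Counter

def claim_frequencies(runs):
    """For each claim, return the fraction of runs in which it appeared."""
    if not runs:
        return {}
    counts = Counter()
    for claims in runs:
        counts.update(set(claims))  # count a claim at most once per run
    return {claim: n / len(runs) for claim, n in counts.items()}
```

What comes out is an agreement rate across runs, not a calibrated probability that the claim is true. Repeated runs of the same model are not independent, so a mistake the model makes consistently will still score high.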
You do understand that they're talking about paper, right? Even if you were feeding it to an LLM -- and you wouldn't be, because that would be legal malpractice -- it would take a non-trivial amount of time just to scan it in!
It's the legal equivalent of paying somebody with a wheelbarrow of pennies.
I think when we discuss large volumes of paper, it is often the case that much of it is irrelevant and not overly hard to sort and analyze. E.g., he is asserting that it's impossible for him to afford to do this. You don't need to keep reading the statements of his resources to each of the 30 institutions he applied to, nor all the refusals, unless it's likely that something therein is meaningful. We can probably read ONE, skim another, and conclude that raising the bond by pledging encumbered real estate he's constantly lied about won't work.
You do understand that AI is fully capable of reading paper, instantaneously?
Digitizing books is child's play.
People downvoting me in this chain are months if not years behind AI news. Paralegals won't have jobs in 3 years. Lawyers won't have jobs in 5-10.
I think you're only partially right about paralegals, but lawyers will be fine because of how the profession is protected. It's essentially a guild system, where you have to be a part of the lawyer's guild (aka the bar) to legally be allowed to lawyer. And AI cannot join regardless of how good it is because lawyers want to keep their jobs. It would take legislation breaking the requirement to be a member of the bar to lawyer to change that, but the people writing legislation are themselves mostly members of the bar.
I won't disagree but, I mean, if I'm a lawyer and I have a law firm, I'd rather split my millions with me and my robots. And I think there's enough like minded greedy lawyers running law firms to set it in motion.
Except instead of you having to split your revenue with your fellow lawyers and having the work split among hundreds of similar firms, you now don't have to split it, but the available lawyering work is split among everyone who can buy a chunk of compute. Unless you being an actual human lawyer is still advantageous, in which case we wouldn't be at the point where AI is actually replacing lawyers.
You're refuting my comment about how humans have to laboriously scan in the documents with... a video of a human laboriously scanning in a document?
For 5000 pages, we're still talking about hours of human labor just to operate the scanner, even if it's a fast one.
No we aren't. They are automated.
And actual robots are currently capable of operating them. Completely autonomously.
Again, y'all are months, if not years behind AI news.
Your own video showed a fucking human, dude.
https://www.youtube.com/watch?v=cmhIJOqepVU
Just google it. This is just the first result, normally you'd remove the spine so you don't have to turn the pages. The book in the other video is a special one that should not be destroyed, and since that fancy shmancy thing from my link is probably more expensive than my socks, it's done manually.
It was foggy's job to support his argument, not mine. He should've done a better job (e.g. by citing the video you found instead of the manual one he picked).
Also, I wrote that it would take "hours" to scan in 5000 pages, even with a fast scanner. The scanner you cited can do 3000 pph, so it would take about 1.7 "hours" to scan 5000 pages. That's still a plural number of hours, so if that's the fastest scanner in the world my statement remains technically correct (the best kind of correct 🤓).
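The arithmetic, for anyone checking, using only the two figures quoted in this thread (5000 pages, 3000 pages per hour). It assumes the scanner runs at its rated speed the whole time, with no allowance for loading paper or clearing jams.

```python
def scan_hours(pages, pages_per_hour):
    """Hours of pure scanning time at a constant rated throughput."""
    return pages / pages_per_hour

hours = scan_hours(5000, 3000)   # 5000 / 3000 = 5/3 of an hour
minutes = round(hours * 60)      # 100 minutes
print(f"{hours:.2f} hours, about {minutes} minutes")
```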
Finally, even a sheet-feed* very fast automatic document scanner (especially one hooked to an LLM in an automated workflow) sounds like a pretty expensive and specialized bit of tech, and I don't know that we can assume the law firm would've chosen to make that investment instead of paying clerks a bunch of man-hours to do it the old, slow way.
(* Frankly, citing a book scanner instead of a sheet-feed one is another way foggy didn't do his argument any favors, since I would've been happy to concede that the documents Trump's lawyers produced were unlikely to have been bound in book form. And even if they were bound for some reason, they weren't the kind of thing anybody would have qualms against running through a band saw to get rid of the spine.)
It's also over a year old.
...Again, y'all are months, if not years behind AI news.
I don't think lawyers/judges/prosecutors in a high-profile multimillion-dollar fraud case are using AI. This would be something they are used to and know how to deal with. And I don't think this size of report would be out of the ordinary for a case like this. A lot of it probably doesn't need to be read but is included for completeness. For example, only a few transactions over the course of a few years may be needed to prove fraud. But the entire transaction list from that time would be included as an appendix for reference.
The smart ones absolutely are using AI. The Judges might not, but the lawyers and prosecution certainly are. They don't have to directly cite AI, but can simply use it to point out the salient bits and save themselves a LOT of time digging for info that they want.
I don't know much about AI but wouldn't a "simple" pattern recognition software do a better job of eliminating unnecessary copies of email chains?
No need to summarize everything if you can just cut the waste.
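For what it's worth, exact-duplicate removal doesn't need AI at all. A rough sketch in Python: hash each message body after dropping quoted lines and normalizing whitespace and case, then keep the first copy of each hash. Treating lines that start with ">" as quoted text is an assumption about how the emails are formatted, and the function names are invented for the example.

```python
import hashlib

def fingerprint(body):
    """Hash an email body after dropping '>'-quoted lines and
    normalizing whitespace and case."""
    kept = [line for line in body.splitlines()
            if not line.lstrip().startswith(">")]
    normalized = " ".join(" ".join(kept).split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def drop_duplicates(bodies):
    """Keep the first copy of each distinct message, preserving order."""
    seen = set()
    unique = []
    for body in bodies:
        key = fingerprint(body)
        if key not in seen:
            seen.add(key)
            unique.append(body)
    return unique
```

This only catches copies that are identical after normalization. Near-duplicates, such as the same chain with a different signature block, would need fuzzier matching.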