Questions tagged [machine-learning]

How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?

From The Discipline of Machine Learning by Tom Mitchell:

The field of Machine Learning seeks to answer the question "How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?" This question covers a broad range of learning tasks, such as how to design autonomous mobile robots that learn to navigate from their own experience, how to data mine historical medical records to learn which future patients will respond best to which treatments, and how to build search engines that automatically customize to their user's interests. To be more precise, we say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.

3322 questions
2
votes
1 answer

Is "nonuniform learnability" only a necessary condition for "PAC learnability"?

When reading "Understanding Machine Learning" by Shai Shalev-Shwartz and Shai Ben-David, I encountered confusion about the relationship between nonuniform (NU) learnability and PAC learnability. The notations below follow "Understanding Machine…
m_water
  • 21
2
votes
0 answers

Confusion related to convexity of a problem

I was reading the paper "Multiclass Classification with Multi-Prototype Support Vector Machines". However, I am having difficulty understanding why they describe the following problem as non-convex. I am really struggling…
user34790
  • 4,192
2
votes
0 answers

Hamiltonian Monte Carlo overestimating variance - how to fix?

I have implemented Hamiltonian Monte Carlo. To test my implementation, I ran it against a normal random variable. After $n$ steps, I compare the sample mean and variance of the HMC output against the true mean…
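The test described in this excerpt can be sketched as follows. This is a minimal illustration, not the asker's implementation: the function name `hmc_standard_normal`, the step size, the number of leapfrog steps, and the seed are all assumptions, and the target is taken to be a standard normal with potential $U(q) = q^2/2$.

```python
import math
import random
import statistics


def hmc_standard_normal(n_samples, step_size=0.2, n_leapfrog=10, seed=0):
    """Hypothetical HMC sampler for a standard normal target, U(q) = q^2 / 2."""
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the momentum each iteration
        q_new, p_new = q, p
        # Leapfrog integration; the gradient of U(q) = q^2 / 2 is q.
        p_new -= 0.5 * step_size * q_new  # initial half step for momentum
        for i in range(n_leapfrog):
            q_new += step_size * p_new  # full step for position
            if i < n_leapfrog - 1:
                p_new -= step_size * q_new  # full step for momentum
        p_new -= 0.5 * step_size * q_new  # final half step for momentum
        # Metropolis correction on the total energy H = U(q) + p^2 / 2.
        h_old = 0.5 * q * q + 0.5 * p * p
        h_new = 0.5 * q_new * q_new + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        # The current state is recorded even when the proposal is rejected.
        samples.append(q)
    return samples


samples = hmc_standard_normal(5000)
print("sample mean:", statistics.mean(samples))
print("sample variance:", statistics.variance(samples))
```

For this target the true mean is 0 and the true variance is 1, so the two printed values give a direct sanity check. Two details worth comparing against any implementation are the half steps for the momentum at both ends of the trajectory and recording the current state again after a rejection.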
2
votes
0 answers

Using image vs position and velocity states for reinforcement learning

I am developing a 2D car simulator to use DRL to find the optimal path from an initial to a target position. Since it is a continuous space, I am using DDPG, an actor-critic method. Is it a good idea to feed the whole image (showing the car, obstacles and…
VP Lex
  • 137
2
votes
1 answer

Can gradient descent be used to find value of exponent?

I'm experimenting with machine learning and I'm trying to develop a model that'll find the exponent that the input will need to be raised to in order to result in the output. For example, if input=$[0, 1, 2, 3]$ and output=$[0, 1, 8, 27]$ then the…
Badr B
  • 631
  • 5
  • 13
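For the exponent question above, a minimal sketch of gradient descent on the model $y = x^p$ with a squared-error loss might look like this. The function name `fit_exponent`, the starting value, the learning rate, and the step count are illustrative assumptions, not taken from the question. Points with $x \le 0$ are skipped because $\ln x$ is undefined there.

```python
import math


def fit_exponent(xs, ys, p0=1.0, lr=1e-3, steps=5000):
    """Hypothetical helper: fit p in y = x**p by plain gradient descent."""
    p = p0
    # Keep only x > 0: the gradient uses log(x), and 0**p says nothing about p.
    points = [(x, y) for x, y in zip(xs, ys) if x > 0]
    for _ in range(steps):
        grad = 0.0
        for x, y in points:
            pred = x ** p
            # d/dp (x**p - y)^2 = 2 * (x**p - y) * x**p * ln(x)
            grad += 2.0 * (pred - y) * pred * math.log(x)
        p -= lr * grad / len(points)
    return p


print(fit_exponent([0, 1, 2, 3], [0, 1, 8, 27]))
```

The learning rate has to be small here because the gradient grows quickly with both $x$ and $p$. When all inputs and outputs are positive, fitting $\ln y = p \ln x$ instead turns this into a linear problem.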
2
votes
2 answers

A few problems understanding Adaboost

I am studying the following Adaboost algorithm: for $t=1,2,\dots,T$: $h_t = \operatorname{argmin}_{h\in H} \sum_{i=1}^m D_i^t\cdot 1_{[h(x_i)\neq y_i]}$ $\epsilon_t = \sum_{i=1}^m D_i^t\cdot 1_{[h_t(x_i)\neq y_i]}$ $w_t =…
Shaq
  • 468
2
votes
0 answers

Clarification: binary cross entropy derivation

I'm learning about cross-entropy in the context of machine learning and I've stumbled across a notation problem I'm not sure about. As I understand, when training a machine learning model using MLE, the goal is to minimize the dissimilarity between…
2
votes
1 answer

When will empirical risk minimization with inductive bias fail?

I am working on an assignment and I am stuck on this problem: Give an example of a class H, some domain space X, a distribution P over X × {0, 1}, and an ERM_H learning algorithm, A, such that for some h* ∈ H, for every sample size m, h* is a much…
2
votes
0 answers

How can I split an ordered set of data by a classification for purity?

This question relates to machine learning and the creation of decision trees. It's been about 5 years since I've done anything related to sets (or math in general), so please forgive my lack of proper symbols, notation and vocabulary and/or…
Rolan
  • 121
2
votes
1 answer

Derivation of simplified form derivative of Deep Learning loss function (equation 6.57 in Deep Learning book)

In the book "Deep Learning" by Ian Goodfellow, Yoshua Bengio and Aaron Courville there is a derivation of the derivative of a loss function that has a very simple form, but is apparently tricky to derive automatically. I have tried to derive the…
GEW
  • 23
2
votes
1 answer

The proof for the value of growth function for convex sets

I am reinforcing my education through self-study of machine learning. When I came across the growth function for convex sets, I found only the result, with the proof skimmed over. I have to accept that $m_{H}(N)=2^{N}$, where $H$ consists of…
2
votes
2 answers

Data Space vs parameter space

Could someone elaborate on the difference between the two, perhaps using a simple example? I am a bit confused as I always thought they were connected...
Bober02
  • 2,546
2
votes
2 answers

Machine learning Linear regression cost function

I am doing a project in deep learning and have been taking Andrew's machine learning course on YouTube. I am having difficulty understanding how the cost function works, given the equation below: $J(\theta)=\min_\theta \frac{1}{2}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$ where m…
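As a rough illustration of the cost function in this excerpt, the sketch below evaluates half the sum of squared errors for a one-feature linear hypothesis. The form $h_\theta(x) = \theta_0 + \theta_1 x$ and the names `hypothesis` and `cost` are assumptions made for the example; the excerpt does not specify them.

```python
def hypothesis(theta, x):
    """Assumed linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    theta0, theta1 = theta
    return theta0 + theta1 * x


def cost(theta, xs, ys):
    """Half the sum of squared errors over all m training examples."""
    return 0.5 * sum((hypothesis(theta, x) - y) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(cost((0.0, 2.0), xs, ys))  # parameters that fit the data exactly
print(cost((0.0, 1.0), xs, ys))  # a worse fit gives a larger cost
```

The cost is a function of the parameters only: the data stay fixed, and minimizing over $\theta$ means searching for the parameter values that make this number as small as possible.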
1
vote
2 answers

Mathematics disciplines underpinning Machine Learning

I have an undergrad degree in computational mathematics (though that was about 10 years ago), and spent my professional career in software development. If I wanted to understand what's happening behind the scenes in ML, and not just blindly apply …
kolosy
  • 133
1
vote
1 answer

How do you compute the offset parameter for an SVM from the dual solution?

If the plane equation for an SVM is: $$\theta \cdot x^{(i)} + \theta_0$$ How do you compute $\theta_0$ from the dual solution? What I have so far is, for every support vector (SV) $x^{(t)}$ we have: $$y^{(t)} (\theta \cdot x^{(t)} + \theta_0) =…